Commits · 6f464e76cf5dd2c3fbb6b7ea84d00e21ee88cdbf · Verlässliche Systemsoftware / projects / osv

Aug 26, 2013

mmu: don't pass really bad faults to the application · 6f464e76

Avi Kivity authored 11 years ago

Attempts to execute the null pointer, and faults within kernel code, are
a really bad sign; it's better to abort early on them.

6f464e76

alloctracker: Fix forget() if remember() hasn't been called · 0affe14a

Pekka Enberg authored 11 years ago

If the leak detector is enabled after OSv startup, the first call can be to
free(), not malloc(). Fix alloctracker::forget() to deal with that.

Fixes the SIGSEGV when "osv leak on" is used to enable detection from
gdb after OSv has started up:

  #
  # A fatal error has been detected by the Java Runtime Environment:
  #
  #  SIGSEGV (0xb) at pc=0x00000000003b8ee6, pid=0, tid=18446673706168635392
  #
  # JRE version: 7.0_25
  # Java VM: OpenJDK 64-Bit Server VM (23.7-b01 mixed mode linux-amd64 compressed oops)
  # Problematic frame:
  # C  0x00000000003b8ee6
  #
  # Core dump written. Default location: //core or core.0
  #
  # An error report file with more information is saved as:
  # /tmp/jvm-0/hs_error.log
  #
  # If you would like to submit a bug report, please include
  # instructions on how to reproduce the bug and visit:
  #   http://icedtea.classpath.org/bugzilla
  #
  Aborted

  [penberg@localhost osv]$ addr2line -e build/debug/loader.elf
  0x00000000003b8ee6
  /home/penberg/osv/build/debug/../../core/alloctracker.cc:90

0affe14a

Aug 25, 2013

rcu: fix hang due to race while awaiting a quiescent state · ac7a8447

Avi Kivity authored 11 years ago

Waiting for a quiescent state happens in two stages: first, we request all
cpus to schedule at least once. Then, we wait until they do so.

If, between the two stages, a cpu is brought online, then we will request
N cpus to schedule but wait for N+1 to respond. This of course never happens,
and the system hangs.

Fix by copying the vector which holds the cpus which we signal and wait for;
forcing them to be consistent. This is safe since newly-added cpus cannot
be accessing any rcu-protected variables before we started signalling.

Fixes random hangs with rcu, mostly seen with 'perf callstack'
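
As an illustration of the fix described above (not the code from this commit), the sketch below takes one copy of the cpu list and uses that same copy for both the request stage and the wait stage. The names `cpu_state`, `cpu_registry`, `request_schedule` and `all_scheduled` are hypothetical, and synchronization of the registry itself is left out.

```cpp
#include <atomic>
#include <memory>
#include <vector>

// Hypothetical per-cpu flag: set when we ask the cpu to schedule,
// cleared by that cpu once it has done so.
struct cpu_state {
    std::atomic<bool> must_schedule{false};
};

// Hypothetical list of online cpus (locking omitted in this sketch).
class cpu_registry {
public:
    cpu_state* bring_online() {
        _cpus.push_back(std::unique_ptr<cpu_state>(new cpu_state));
        return _cpus.back().get();
    }
    // Copy the current cpu list so both stages operate on the same set.
    std::vector<cpu_state*> snapshot() const {
        std::vector<cpu_state*> copy;
        for (const auto& c : _cpus) {
            copy.push_back(c.get());
        }
        return copy;
    }
private:
    std::vector<std::unique_ptr<cpu_state>> _cpus;
};

// Stage 1: ask every cpu in the snapshot to schedule at least once.
inline void request_schedule(const std::vector<cpu_state*>& cpus) {
    for (cpu_state* c : cpus) {
        c->must_schedule.store(true);
    }
}

// Stage 2 (one polling step): have all cpus in the snapshot responded?
inline bool all_scheduled(const std::vector<cpu_state*>& cpus) {
    for (cpu_state* c : cpus) {
        if (c->must_schedule.load()) {
            return false;
        }
    }
    return true;
}
```

A cpu brought online after `snapshot()` is neither signalled nor waited for, so the two counts cannot disagree.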

ac7a8447

Aug 19, 2013

dhcp: convert to WITH_LOCK · eed5bafd

Avi Kivity authored 11 years ago

eed5bafd

Aug 18, 2013

dhcp: allow broadcast responses · d9a5ed59

Avi Kivity authored 11 years ago

The QEMU DHCP server responds with broadcast packets; allow them.

d9a5ed59

sched: reduce wakeup IPIs further · 5b05bade

Avi Kivity authored 11 years ago

Following 71fec998, we note that if any bit in the wakeup mask
is set, then an IPI to that cpu is either imminent or already in flight, and
we can elide our own IPI to that cpu.
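
A minimal sketch of this idea, assuming a per-cpu atomic bitmask with one bit per waking cpu. The type `cpu_sketch`, its members, and the functions `wake_from` and `drain_wakeups` are hypothetical stand-ins, not the scheduler's real interface.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical target cpu: a mask of pending wakeups plus an IPI counter
// that stands in for actually sending an interrupt.
struct cpu_sketch {
    std::atomic<std::uint64_t> incoming_wakeups_mask{0};
    unsigned ipis_sent = 0;
    void send_wakeup_ipi() { ++ipis_sent; }
};

// Record a wakeup from cpu `waker_id` (must be < 64).
// Returns true if an IPI was sent. If any bit was already set, another
// waker's IPI is imminent or in flight, so ours can be elided.
inline bool wake_from(cpu_sketch& target, unsigned waker_id) {
    const std::uint64_t bit = std::uint64_t(1) << waker_id;
    const std::uint64_t old = target.incoming_wakeups_mask.fetch_or(bit);
    if (old != 0) {
        return false;
    }
    target.send_wakeup_ipi();
    return true;
}

// The target cpu collects and clears all pending wakeups in one step.
inline std::uint64_t drain_wakeups(cpu_sketch& target) {
    return target.incoming_wakeups_mask.exchange(0);
}
```

Only the waker that flips the mask from zero to non-zero pays for the IPI; everyone else just sets a bit.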

5b05bade

Aug 16, 2013

sched: Avoid IPIs in thread::wake() · 71fec998

Pekka Enberg authored 11 years ago

Avoid sending an IPI to a CPU that's already being woken up by another
IPI.  This reduces IPIs by 17% for a cassandra-stress run. Execution
time is obviously unaffected because execution is bound by lock
contention.

Before:

[penberg@localhost ~]$ sudo perf kvm stat -e kvm:* -p `pidof qemu-system-x86_64`
^C
 Performance counter stats for process id '610':

         6,909,333 kvm:kvm_entry
                 0 kvm:kvm_hypercall
                 0 kvm:kvm_hv_hypercall
         1,035,125 kvm:kvm_pio
                 0 kvm:kvm_cpuid
         5,149,393 kvm:kvm_apic
         6,909,369 kvm:kvm_exit
         2,108,440 kvm:kvm_inj_virq
                 0 kvm:kvm_inj_exception
               982 kvm:kvm_page_fault
         2,783,005 kvm:kvm_msr
                 0 kvm:kvm_cr
             7,354 kvm:kvm_pic_set_irq
         2,366,388 kvm:kvm_apic_ipi
         2,468,569 kvm:kvm_apic_accept_irq
         2,067,044 kvm:kvm_eoi
         1,982,000 kvm:kvm_pv_eoi
                 0 kvm:kvm_nested_vmrun
                 0 kvm:kvm_nested_intercepts
                 0 kvm:kvm_nested_vmexit
                 0 kvm:kvm_nested_vmexit_inject
                 0 kvm:kvm_nested_intr_vmexit
                 0 kvm:kvm_invlpga
                 0 kvm:kvm_skinit
             3,677 kvm:kvm_emulate_insn
                 0 kvm:vcpu_match_mmio
                 0 kvm:kvm_update_master_clock
                 0 kvm:kvm_track_tsc
             7,354 kvm:kvm_userspace_exit
             7,354 kvm:kvm_set_irq
             7,354 kvm:kvm_ioapic_set_irq
               674 kvm:kvm_msi_set_irq
                 0 kvm:kvm_ack_irq
                 0 kvm:kvm_mmio
           609,915 kvm:kvm_fpu
                 0 kvm:kvm_age_page
                 0 kvm:kvm_try_async_get_page
                 0 kvm:kvm_async_pf_doublefault
                 0 kvm:kvm_async_pf_not_present
                 0 kvm:kvm_async_pf_ready
                 0 kvm:kvm_async_pf_completed

      81.180469772 seconds time elapsed

After:

[penberg@localhost ~]$ sudo perf kvm stat -e kvm:* -p `pidof qemu-system-x86_64`
^C
 Performance counter stats for process id '30824':

         6,411,175 kvm:kvm_entry                                                [100.00%]
                 0 kvm:kvm_hypercall                                            [100.00%]
                 0 kvm:kvm_hv_hypercall                                         [100.00%]
           992,454 kvm:kvm_pio                                                  [100.00%]
                 0 kvm:kvm_cpuid                                                [100.00%]
         4,300,001 kvm:kvm_apic                                                 [100.00%]
         6,411,133 kvm:kvm_exit                                                 [100.00%]
         2,055,189 kvm:kvm_inj_virq                                             [100.00%]
                 0 kvm:kvm_inj_exception                                        [100.00%]
             9,760 kvm:kvm_page_fault                                           [100.00%]
         2,356,260 kvm:kvm_msr                                                  [100.00%]
                 0 kvm:kvm_cr                                                   [100.00%]
             3,354 kvm:kvm_pic_set_irq                                          [100.00%]
         1,943,731 kvm:kvm_apic_ipi                                             [100.00%]
         2,047,024 kvm:kvm_apic_accept_irq                                      [100.00%]
         2,019,044 kvm:kvm_eoi                                                  [100.00%]
         1,949,821 kvm:kvm_pv_eoi                                               [100.00%]
                 0 kvm:kvm_nested_vmrun                                         [100.00%]
                 0 kvm:kvm_nested_intercepts                                    [100.00%]
                 0 kvm:kvm_nested_vmexit                                        [100.00%]
                 0 kvm:kvm_nested_vmexit_inject                                 [100.00%]
                 0 kvm:kvm_nested_intr_vmexit                                   [100.00%]
                 0 kvm:kvm_invlpga                                              [100.00%]
                 0 kvm:kvm_skinit                                               [100.00%]
             1,677 kvm:kvm_emulate_insn                                         [100.00%]
                 0 kvm:vcpu_match_mmio                                          [100.00%]
                 0 kvm:kvm_update_master_clock                                  [100.00%]
                 0 kvm:kvm_track_tsc                                            [100.00%]
             3,354 kvm:kvm_userspace_exit                                       [100.00%]
             3,354 kvm:kvm_set_irq                                              [100.00%]
             3,354 kvm:kvm_ioapic_set_irq                                       [100.00%]
               927 kvm:kvm_msi_set_irq                                          [100.00%]
                 0 kvm:kvm_ack_irq                                              [100.00%]
                 0 kvm:kvm_mmio                                                 [100.00%]
           620,278 kvm:kvm_fpu                                                  [100.00%]
                 0 kvm:kvm_age_page                                             [100.00%]
                 0 kvm:kvm_try_async_get_page                                   [100.00%]
                 0 kvm:kvm_async_pf_doublefault                                 [100.00%]
                 0 kvm:kvm_async_pf_not_present                                 [100.00%]
                 0 kvm:kvm_async_pf_ready                                       [100.00%]
                 0 kvm:kvm_async_pf_completed

      79.947992238 seconds time elapsed

71fec998

mempool: Fix GPF in debug realloc() · ba81e15a

Pekka Enberg authored 11 years ago

Starting up Cassandra with the debug memory allocator triggers a GPF:

  Breakpoint 1, abort () at ../../runtime.cc:85
  85	{
  (gdb) bt
  #0  abort () at ../../runtime.cc:85
  #1  0x0000000000375812 in osv::generate_signal (siginfo=..., ef=ef@entry=0xffffc0003ffe3008) at ../../libc/signal.cc:40
  #2  0x000000000037587c in osv::handle_segmentation_fault (addr=addr@entry=18446708889768681440, ef=ef@entry=0xffffc0003ffe3008)
      at ../../libc/signal.cc:55
  #3  0x00000000002fba02 in page_fault (ef=0xffffc0003ffe3008) at ../../core/mmu.cc:876
  #4  <signal handler called>
  #5  dbg::realloc (v=v@entry=0xffffe00019b3e000, size=size@entry=16) at ../../core/mempool.cc:846
  #6  0x000000000032654c in realloc (obj=0xffffe00019b3e000, size=16) at ../../core/mempool.cc:870
  #7  0x0000100000627743 in ?? ()
  #8  0x00002000001fe770 in ?? ()
  #9  0x00002000001fe780 in ?? ()
  #10 0x00002000001fe710 in ?? ()
  #11 0x00002000001fe700 in ?? ()
  #12 0xffffe000170e8000 in ?? ()
  #13 0x0000000200000001 in ?? ()
  #14 0x0000000000000020 in ?? ()
  #15 0x00002000001ffe70 in ?? ()
  #16 0xffffe000170e0004 in ?? ()
  #17 0x000000000036f361 in strcpy (dest=0x100001087420 "", src=<optimized out>) at ../../libc/string/strcpy.c:8
  #18 0x0000100000629b53 in ?? ()
  #19 0xffffe00019b22000 in ?? ()
  #20 0x0000000000000001 in ?? ()
  #21 0x0000000000000000 in ?? ()

The problem was introduced in commit 1ea5672f ("memory: let the debug
allocator mimic the standard allocator more closely") which forgot
to convert realloc() to use 'pad_before'.

ba81e15a

Aug 15, 2013

mempool: workaround for unaligned allocations · c19c8aec

Avi Kivity authored 11 years ago

An allocation that is larger than half a page, but smaller than a page,
will end up badly aligned.

Work around it by using the large allocators for objects larger than half
a page.  This is wasteful and slow but at least it works.

Later we can improve this by moving the slab header to the end of the page,
so it doesn't interfere with alignment.
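
The workaround amounts to a size threshold. A rough sketch, assuming a 4096-byte page; `allocator_kind` and `choose_allocator` are hypothetical names used only for illustration.

```cpp
#include <cstddef>

constexpr std::size_t page_size = 4096;  // assumed page size

enum class allocator_kind { small_pool, large };

// Objects larger than half a page go to the large allocator, so the
// small-object pools never see sizes that would end up badly aligned.
constexpr allocator_kind choose_allocator(std::size_t size) {
    return size <= page_size / 2 ? allocator_kind::small_pool
                                 : allocator_kind::large;
}
```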

c19c8aec

mempool: Fix refill_page_buffer() on out-of-memory · 2149839e

Pekka Enberg authored 11 years ago

Building OSv with debug memory allocator enabled:

  $ make -j mode=debug conf-preempt=0 conf-debug_memory=1

Causes the guest to enter a busy loop right after JVM starts up:

  $ ./scripts/run.py -d

  [...]

  OpenJDK 64-Bit Server VM warning: Can't detect initial thread stack location - find_vma failed

GDB explains:

  #0  0x00000000003b5c54 in
boost::intrusive::rbtree_impl<boost::intrusive::setopt<boost::intrusive::detail::member_hook_traits<memory::page_range,
boost::intrusive::set_member_hook<boost::intrusive::none,
boost::intrusive::none, boost::intrusive::none, boost::intrusive::none>,
&memory::page_range::member_hook>, memory::addr_cmp, unsigned long, true>
>::private_erase (this=0x1d2f8c8 <memory::free_page_ranges+8>, b=..., e=...,
n=@0x3b40e9: 6179885759521391432) at
../../external/misc.bin/usr/include/boost/intrusive/rbtree.hpp:1417
  #1  0x00000000003b552e in
boost::intrusive::rbtree_impl<boost::intrusive::setopt<boost::intrusive::detail::member_hook_traits<memory::page_range,
boost::intrusive::set_member_hook<boost::intrusive::none,
boost::intrusive::none, boost::intrusive::none, boost::intrusive::none>,
&memory::page_range::member_hook>, memory::addr_cmp, unsigned long, true>
>::erase<memory::page_range, memory::addr_cmp>(memory::page_range const&,
memory::addr_cmp,
boost::intrusive::detail::enable_if_c<!boost::intrusive::detail::is_convertible<memory::addr_cmp,
boost::intrusive::tree_iterator<boost::intrusive::rbtree_impl<boost::intrusive::setopt<boost::intrusive::detail::member_hook_traits<memory::page_range,
boost::intrusive::set_member_hook<boost::intrusive::none,
boost::intrusive::none, boost::intrusive::none, boost::intrusive::none>,
&memory::page_range::member_hook>, memory::addr_cmp, unsigned long, true> >,
true> >::value, void>::type*) (this=0x1d2f8c0 <memory::free_page_ranges>,
key=..., comp=...) at
../../external/misc.bin/usr/include/boost/intrusive/rbtree.hpp:878
  #2  0x00000000003b4c4e in
boost::intrusive::rbtree_impl<boost::intrusive::setopt<boost::intrusive::detail::member_hook_traits<memory::page_range,
boost::intrusive::set_member_hook<boost::intrusive::none,
boost::intrusive::none, boost::intrusive::none, boost::intrusive::none>,
&memory::page_range::member_hook>, memory::addr_cmp, unsigned long, true>
>::erase (this=0x1d2f8c0 <memory::free_page_ranges>, value=...) at
../../external/misc.bin/usr/include/boost/intrusive/rbtree.hpp:856
  #3  0x00000000003b4145 in
boost::intrusive::set_impl<boost::intrusive::setopt<boost::intrusive::detail::member_hook_traits<memory::page_range,
boost::intrusive::set_member_hook<boost::intrusive::none,
boost::intrusive::none, boost::intrusive::none, boost::intrusive::none>,
&memory::page_range::member_hook>, memory::addr_cmp, unsigned long, true>
>::erase (this=0x1d2f8c0 <memory::free_page_ranges>, value=...) at
../../external/misc.bin/usr/include/boost/intrusive/set.hpp:601
  #4  0x00000000003b0130 in memory::refill_page_buffer () at ../../core/mempool.cc:487
  #5  0x00000000003b05f8 in memory::untracked_alloc_page () at ../../core/mempool.cc:569
  #6  0x00000000003b0631 in memory::alloc_page () at ../../core/mempool.cc:577
  #7  0x0000000000367a7c in mmu::populate::small_page (this=0x2000001fd460, ptep=..., offset=0) at ../../core/mmu.cc:456
  #8  0x0000000000365b00 in mmu::page_range_operation::operate_page
(this=0x2000001fd460, huge=false, addr=0xffffe0004ec9b000, offset=0) at
../../core/mmu.cc:438
  #9  0x0000000000365790 in mmu::page_range_operation::operate
(this=0x2000001fd460, start=0xffffe0004ec9b000, size=4096) at
../../core/mmu.cc:387
  #10 0x0000000000366148 in mmu::vpopulate (addr=0xffffe0004ec9b000, size=4096) at ../../core/mmu.cc:657
  #11 0x00000000003b0d8d in dbg::malloc (size=16) at ../../core/mempool.cc:818
  #12 0x00000000003b0f32 in malloc (size=16) at ../../core/mempool.cc:854

Fix the problem by checking if free_page_ranges is empty in
refill_page_buffer(). This fixes the busy loop issue and shows what's
really happening:

  OpenJDK 64-Bit Server VM warning: Can't detect initial thread stack location - find_vma failed
  alloc_page(): out of memory
  Aborted
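
The shape of the fix can be sketched as follows. This is not the code in core/mempool.cc: the free ranges are modelled as a `std::map` from first page number to page count, and the function signature is made up for the example.

```cpp
#include <cstddef>
#include <map>
#include <vector>

// Move single pages from the free ranges into `buffer` until it holds
// `want` pages. Returns the number of pages added. The emptiness check
// is the point: without it, the loop would keep operating on an empty
// container instead of letting the caller report out-of-memory.
inline std::size_t refill_page_buffer_sketch(
        std::vector<std::size_t>& buffer,
        std::map<std::size_t, std::size_t>& free_ranges,
        std::size_t want) {
    std::size_t added = 0;
    while (buffer.size() < want) {
        if (free_ranges.empty()) {
            break;  // out of memory: stop and let the caller decide
        }
        auto it = free_ranges.begin();
        const std::size_t first = it->first;
        const std::size_t count = it->second;
        free_ranges.erase(it);
        buffer.push_back(first);
        ++added;
        if (count > 1) {
            // Put the remainder of the range back.
            free_ranges.emplace(first + 1, count - 1);
        }
    }
    return added;
}
```

A caller that gets nothing back and still has an empty buffer can then fail loudly, as in the "alloc_page(): out of memory" message above.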

2149839e

Aug 14, 2013

trace: RCU-protect tracepoint_base::probes · 31fe10b3

Pekka Enberg authored 11 years ago

As suggested by Avi, RCU-protect tracepoint_base::probes to make sure
probes are really stopped before the caller accesses the collected
traces.

31fe10b3

rcu: Add rcu_synchronize() API · 59c61f31

Avi Kivity authored 11 years ago

59c61f31

callstack: Fix list iteration in callstack_collector::merge() · eff44bc7

Pekka Enberg authored 11 years ago

A call to erase() invalidates iterators, so switch from a range-based for
loop to iterating manually with iterators.
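
The general pattern, shown on a plain `std::list` rather than the collector's real data structure: continue from the iterator that `erase()` returns instead of advancing one that was just invalidated. The `sample` type and `merge_duplicates` are hypothetical.

```cpp
#include <iterator>
#include <list>
#include <utility>

// Hypothetical sample: (key, hit count).
using sample = std::pair<int, unsigned>;

// Fold entries with equal keys into the first occurrence.
inline void merge_duplicates(std::list<sample>& samples) {
    for (auto it = samples.begin(); it != samples.end(); ++it) {
        auto other = std::next(it);
        while (other != samples.end()) {
            if (other->first == it->first) {
                it->second += other->second;
                // erase() invalidates `other`; continue from its result.
                other = samples.erase(other);
            } else {
                ++other;
            }
        }
    }
}
```

A range-based for loop hides the iterator, so there is no way to pick up the value returned by `erase()`.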

This fixes a bug that resulted in JVM crashing on SMP when "perf
callstack" was run:

  #
  # A fatal error has been detected by the Java Runtime Environment:
  #
  #  SIGSEGV (0xb) at pc=0x0000000000328a44, pid=0, tid=18446673706080178176
  #
  # JRE version: 7.0_19
  # Java VM: OpenJDK 64-Bit Server VM (23.7-b01 mixed mode linux-amd64 compressed oops)
  # Problematic frame:
  # C  0x0000000000328a44
  #
  # Core dump written. Default location: //core or core.0
  #
  # An error report file with more information is saved as:
  # /tmp/jvm-0/hs_error.log
  #
  # If you would like to submit a bug report, please include
  # instructions on how to reproduce the bug and visit:
  #   http://icedtea.classpath.org/bugzilla
  #
  Aborted

eff44bc7

Aug 13, 2013

rcu: fix debug build re rcu_read_lock · 14865fa2

Avi Kivity authored 11 years ago

The release build optimizes away references to this object, but the debug
build does not.  Define it.

14865fa2

dhcp: fix random dhcp failures · 26a04985

Avi Kivity authored 11 years ago

We don't initialize the dhcp packets, so some of them get the relay agent IP
set, and the DHCP DISCOVER packets get sent to a random address on the
Internet.  Usually it doesn't have a DHCP server installed, so the guest
does not get configured.

Fix by zero-initializing the packet.
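
To illustrate, here is a sketch with a cut-down BOOTP/DHCP header. The struct covers only the leading fixed fields and is not the project's packet type; `make_discover` is a hypothetical helper.

```cpp
#include <cstdint>

// Leading fixed fields of a BOOTP/DHCP message (remaining fields omitted).
struct bootp_header_sketch {
    std::uint8_t  op;
    std::uint8_t  htype;
    std::uint8_t  hlen;
    std::uint8_t  hops;
    std::uint32_t xid;
    std::uint16_t secs;
    std::uint16_t flags;
    std::uint32_t ciaddr;
    std::uint32_t yiaddr;
    std::uint32_t siaddr;
    std::uint32_t giaddr;  // relay agent IP address
};

inline bootp_header_sketch make_discover(std::uint32_t xid) {
    bootp_header_sketch pkt{};  // value-initialization zeroes every field
    pkt.op = 1;                 // BOOTREQUEST
    pkt.htype = 1;              // Ethernet
    pkt.hlen = 6;               // MAC address length
    pkt.xid = xid;
    return pkt;                 // giaddr stays 0 unless set deliberately
}
```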

26a04985

Aug 12, 2013

build: link libstdc++, libgcc_s only once · c9e61d4a

Avi Kivity authored 11 years ago

Currently we statically link to libstdc++ and libgcc_s, and also dynamically
link to the same libraries (since the payload requires them).  This causes
some symbols to be available from both the static and dynamic version.

With the resolution order change introduced by 82513d41, we can
resolve the same symbol to different addresses at different times.  This
violates the One Definition Rule, and in fact breaks std::string's
destructor.

Fix by only linking in the libraries statically.  We use ld's --whole-archive
flag to bring in all symbols, including those that may be used by the payload
but not by the kernel.

Some symbols now become duplicates; we drop our version.

c9e61d4a

Aug 11, 2013

rcu: add basic read-copy-update implementation · 94b69794

Avi Kivity authored 11 years ago

This adds fairly basic support for rcu.

Declaring:

   mutex mtx;
   rcu_ptr<my_object> my_ptr;

Read-side:

   WITH_LOCK(rcu_read_lock) {
      const my_object* p = my_ptr.read();
      // do things with *p
      // but don't block!
   }

Write-side:

  WITH_LOCK(mtx) {
    my_object* old = my_ptr.read_by_owner();
    my_object* p = new my_object;
    // ...
    my_ptr.assign(p);
    rcu_dispose(old);  // or rcu_defer(some_func, old);
  }

94b69794

Aug 08, 2013

Dynamic linker - fix crash on SMP · d703ec00

Nadav Har'El authored 11 years ago

This patch fixes the following bug: CLI & memcached on two vcpus
crash on startup. The cause of the crash is this: Java is running
two threads. One loads a new shared library (in this example, libnio.so),
while the second thread is running normally and calls a function it hasn't
run before (pthread_cond_destroy()). When our on-demand resolver code tries
to resolve this function name, it iterates over the module list, and sees
libnio.so, but this object hasn't been completely set up yet (we put it in
the list first - see program::add_object()), so looking up a symbol in it
crashes.

Why hasn't this problem been noticed before the recent link-order change?
Because before that change, the half-loaded library was always last in the
list (OSV itself was the first), so existing symbols were always found before
reaching the partially-set-up object. Now OSV, with many symbols, is last, and
the half-set-up object is in the middle, so the problem is common. But it
also could happen previously, if we had unresolved symbols (e.g., weak symbols),
but these were probably rare enough for the bug not to happen in practice.

The fix in this patch is "hacky", because I wanted to avoid restructuring
the whole code. The problem is that the functions called in add_object()
(including relocate_rela(), nested add_object(), etc.) all assume that
they can look up symbols in the being-set-up object, while we don't want
these objects to be visible for other threads. So we do exactly this -
each object gets a "visibility" field. If "visibility" is 0, all threads
can use this object, but if visibility is a thread pointer, only this
thread searches in this object. So add_object() starts the object
with visibility set to its thread, and only when add_object() is done,
it sets the visibility to 0 so all threads can see it.
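
A sketch of the visibility check, with an opaque pointer standing in for the thread pointer. The class `elf_object_sketch` and its methods are hypothetical, and the atomic is a choice made for this example rather than something the patch is stated to use.

```cpp
#include <atomic>

// Stand-in for a thread pointer; nullptr plays the role of "0".
using thread_token = const void*;

class elf_object_sketch {
public:
    // The object starts out visible only to the thread that is loading it.
    explicit elf_object_sketch(thread_token loader) : _visibility(loader) {}

    // Symbol lookups skip objects that are not visible to the caller.
    bool visible_to(thread_token t) const {
        thread_token v = _visibility.load(std::memory_order_acquire);
        return v == nullptr || v == t;
    }

    // Called once setup is complete: make the object visible to everyone.
    void publish() {
        _visibility.store(nullptr, std::memory_order_release);
    }

private:
    std::atomic<thread_token> _visibility;
};
```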

While this solves the common bug, note that this patch still leaves
some room for SMP bugs, because it doesn't add locking to _modules,
so a lookup during an add_object() can see a broken vector for a short
duration. We should fix this remaining problem later, using RCU.

d703ec00

dl_iterate_phdr: Don't pass pointers on the stack to callback · a26fe58c

Nadav Har'El authored 11 years ago

This patch fixes the exception handling bug seen in tst-except.so.

The callback given to dl_iterate_phdr (such as _Unwind_IteratePhdrCallback
in libgcc_eh.a used to implement exceptions) may decide to cache previous
information we gave it, as long as the "adds" and "subs" fields are
unchanged.

The bug was that we passed to the callback an on-stack *copy* of the
obj->_phdrs vector, and if the callback saved pointers to that in its
cache, they became invalid on the next call. We need the pointers to
remain valid as long as adds/subs do not change. So we need to pass
the actual obj->_phdrs (which doesn't change after the object's load),
NOT a copy.

Note there's a locking issue remaining here - if someone dlclose()s
an object while the callback is running (and already checked adds/subs)
it can use a stale pointer. This should be fixed separately, probably
by using reference counting on objects.

a26fe58c

dl_iterate_phdr: fill missing adds and subs field · 4d8353ac

Nadav Har'El authored 11 years ago

The callback function passed to dl_iterate_phdr, such as
_Unwind_IteratePhdrCallback (used in libgcc_eh.a to implement exceptions),
may want to cache previous lookups, and wants to know when the list of
iterated modules hasn't changed since the last call to dl_iterate_phdr.
For this, dl_iterate_phdr() is supposed to fill two fields, dlpi_adds
and dlpi_subs, counting the number of times objects were loaded or
unloaded from the program. If both dlpi_subs and dlpi_adds are unchanged,
the callback is guaranteed the list of objects is unchanged.
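
For readers unfamiliar with these counters, here is a sketch of a caller-side cache keyed on them. It assumes a Linux libc that provides `dl_iterate_phdr` in `<link.h>` and fills `dlpi_adds` and `dlpi_subs`; `module_cache`, `scan_state` and `count_modules` are hypothetical, and the cached "result" is just a module count.

```cpp
#include <link.h>   // dl_iterate_phdr, dl_phdr_info (Linux)
#include <cstddef>

struct module_cache {
    bool valid = false;
    unsigned long long adds = 0;
    unsigned long long subs = 0;
    std::size_t module_count = 0;  // stand-in for cached lookup results
    unsigned rebuilds = 0;         // how often we had to rescan
};

struct scan_state {
    module_cache* cache = nullptr;
    bool first_call = true;
    bool have_counters = false;
    bool reused = false;
    unsigned long long adds = 0;
    unsigned long long subs = 0;
    std::size_t seen = 0;
};

static int scan_callback(dl_phdr_info* info, std::size_t size, void* data) {
    scan_state* st = static_cast<scan_state*>(data);
    if (st->first_call) {
        st->first_call = false;
        // Only read the counters if the passed struct is large enough.
        st->have_counters = size >= offsetof(dl_phdr_info, dlpi_subs)
                                    + sizeof(info->dlpi_subs);
        if (st->have_counters) {
            st->adds = info->dlpi_adds;
            st->subs = info->dlpi_subs;
            if (st->cache->valid && st->cache->adds == st->adds
                    && st->cache->subs == st->subs) {
                st->reused = true;
                return 1;  // non-zero return stops the iteration
            }
        }
    }
    ++st->seen;
    return 0;
}

// Returns the number of loaded objects, rescanning only when the
// adds/subs counters show that the list may have changed.
inline std::size_t count_modules(module_cache& cache) {
    scan_state st;
    st.cache = &cache;
    dl_iterate_phdr(scan_callback, &st);
    if (!st.reused) {
        cache.module_count = st.seen;
        cache.adds = st.adds;
        cache.subs = st.subs;
        cache.valid = st.have_counters;
        ++cache.rebuilds;
    }
    return cache.module_count;
}
```

If the counters held random values, as they did before this patch, the equality test would succeed or fail arbitrarily.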

In the existing code, we forgot to set these two fields, so they got
random values which caused the exception unwinding code to sometimes
cache, and sometimes not cache, depending on the phase of the moon.

This patch adds the counting of the correct "subs" and "adds" counters,
and after it exception unwinding will always use its cache (as long
as the list of objects doesn't change).

Note that this does NOT fix the crash in tst-except.so. That is a bug
which appears when caching is enabled (which before this patch happened
randomly), and will be fixed by the next patch.

4d8353ac

remove newline() function for console · 58954552

Glauber Costa authored 11 years ago

Our console write() takes 3 parameters. The last one controls whether or not we
will issue a newline at the end of input. If it is true, we will call the
console's implementation of newline(). It is always passed as false, though.
Remove it and fix the callers.

58954552

Aug 07, 2013

mempool: Fix formatting in refill_page_buffer() · 6f77fcf0

Pekka Enberg authored 11 years ago

6f77fcf0

mempool: Switch to WITH_LOCK · 9cc3da68

Pekka Enberg authored 11 years ago

9cc3da68

Aug 06, 2013

Condvar: disable wait morphing for old spin-based mutex · 7b20de3c

Nadav Har'El authored 11 years ago

The option to undef LOCKFREE_MUTEX in osv/mutex.h, to enable the old
spin-based mutex, got broken after the addition of the wait morphing
feature.

Wait morphing needs two features which were never added to the spin-based
option, and probably never will (we should in the near future remove the
old mutex implementation): 1. The mutex->send_lock() feature, on which
the wait morphing feature is built, and 2. The ability to add another
field, "user_mutex", to condvar, which means condvar cannot be constrained
in size so libc/pthread.cc must use a pointer, which we only currently
do for the larger LOCKFREE_MUTEX.

So this patch re-adds the "WAIT_MORPHING" compile-time option, and
disables WAIT_MORPHING if LOCKFREE_MUTEX is disabled.

Now OSV can be compiled with #undef LOCKFREE_MUTEX.

7b20de3c

mempool: Fix calloc() on OOM · 12e906a8

Pekka Enberg authored 11 years ago

Make calloc() deal with malloc() returning NULL.

12e906a8

mempool: Use nullptr instead of NULL · 8722491c

Pekka Enberg authored 11 years ago

8722491c

Aug 05, 2013

elf: fix symbol resolution order · 82513d41

Nadav Har'El authored 11 years ago

This patch fixes the bug in tst-resolve.so, where an OSV symbol (such as
debug()) hides a symbol in the application (a shared object we are running).

We look up symbols in load order - if tst-resolve.so needs libstdc++.so, we
search for symbols in this order - first in tst-resolve.so and then
libstdc++.so. This order is mostly fine. There is one problem though - that
OSV itself is loaded first, so always gets searched first, which is not
really what users expect: Users expect OSV to behave like the glibc library
(searched last), not like the main executable (searched first). So this patch
keeps the first-loaded object (OSV itself) last on the search list.
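
A toy model of the resulting search order, assuming modules are kept in load order with the kernel at index 0. `module_sketch` and `resolve` are hypothetical and far simpler than a real ELF symbol lookup.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

struct module_sketch {
    std::string name;
    std::map<std::string, int> symbols;  // symbol name -> fake address
};

// modules[0] is the first-loaded object (the kernel itself).
// Search every later object in load order, and the kernel last.
inline const module_sketch* resolve(const std::vector<module_sketch>& modules,
                                    const std::string& symbol) {
    for (std::size_t i = 1; i < modules.size(); ++i) {
        if (modules[i].symbols.count(symbol) != 0) {
            return &modules[i];
        }
    }
    if (!modules.empty() && modules[0].symbols.count(symbol) != 0) {
        return &modules[0];
    }
    return nullptr;
}
```

With this order, an application's own debug() wins over the kernel's, while symbols that only the kernel defines still resolve.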

82513d41

sched: poll for longer before HLTing · 4e3177e3

Avi Kivity authored 11 years ago

Increasing the poll time increases the chances that we can avoid the IPI.
Idle host load increase is negligible.

4e3177e3

sched: disable IPIs while polling wakeup queue before idle · 032aa932

Avi Kivity authored 11 years ago

The scheduler polls the wakeup queue when idle for a short time before HLTing
in order to avoid the expensive HLT instruction if a wakeup arrives early.
This patch extends this to also disable remote wakeups during the polling
period, reducing the waking cpu's need to issue an IPI, which requires an
exit. This helps synchronous multithreaded workloads, where threads block
and wake each other.

Together with the following patch, netperf throughput increases from ~17Gbps
to ~19Gbps, and the context switch benchmark improves from

$ run tests/tst-ctxsw.so
       345 colocated
      5761 apart
       633 nopin

to

$ run tests/tst-ctxsw.so
       347 colocated
       598 apart
       467 nopin

032aa932

Aug 04, 2013

Convert poll.c to poll.cc · e1686fc4

Nadav Har'El authored 11 years ago

Convert poll.c to poll.cc, and add a few tracepoints.

e1686fc4

Jul 31, 2013

dhcp: remove M_ZERO from mbuf allocation · 4ac239de

Avi Kivity authored 11 years ago

M_ZERO requests zeroing of the entire mbuf, which clears the fields initialized
by the init function. It only works now because we don't honor M_ZERO.

Remove M_ZERO and replace with bzero() for the packet data only.

4ac239de

callstack: fix uninitialized data in trace object · 4829c0bc

Avi Kivity authored 11 years ago

The trace object contains a list link for the hash table, which needs to be
initialized by the constructor.  However, trace_alloc() did not call the
constructor and returned uninitialized memory.

Fix by calling the constructor.  For simplicity, all of the object's
initialization is moved to the constructor.
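
In C++ terms, the fix is to run the constructor on the raw memory with placement new. A sketch with a made-up `trace_sketch` type (the real trace object and trace_alloc() differ):

```cpp
#include <cstddef>
#include <new>

struct trace_sketch {
    trace_sketch* next;   // hash-table link; must not start out as garbage
    std::size_t depth;
    unsigned hits;

    explicit trace_sketch(std::size_t d) : next(nullptr), depth(d), hits(0) {}
};

// `raw` must be suitably aligned and at least sizeof(trace_sketch) bytes.
// Placement new runs the constructor, so every member is initialized.
inline trace_sketch* trace_alloc_sketch(void* raw, std::size_t depth) {
    return new (raw) trace_sketch(depth);
}
```

Returning `static_cast<trace_sketch*>(raw)` instead would hand back whatever bytes happened to be in the buffer.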

4829c0bc

make initialization priorities explicit · f801c763

Glauber Costa authored 11 years ago

I recently ran into an early init bug that ended up being tracked
down to a changeset in which the initialization priorities of the constructors
were changed. One of the changed ones was kvmclock, but the change did not
update kvmclock.

I propose we use constants for that. To avoid things like this in the future,
wherever priorities are used, I believe they should come from the same place
so that the order is utterly obvious. To handle that, I am creating the prio.hh
file, and sticking all priority definitions in there.

f801c763

Jul 29, 2013

Change the max per-cpu free page buffer · c2815d3e

Dor Laor authored 11 years ago

When netperf is executed, there is high demand for pages.
Here is a table of values for the max constant and the matching
netperf results:

buf size  Throughput Mbps
1024     16700
 512     16800
 256     15500
 128     14500
  64     13680

c2815d3e

Jul 28, 2013

dhcp: initial dhcp implementation · fc428fcf

Guy Zana authored 11 years ago

Use DHCP to discover an IP address for each interface. DHCP packets are
hooked in the networking stack in ip_input and queued for deferred
processing by a dhcp worker thread. DHCP packets are sent directly over
Ethernet (building the IP and UDP headers). There's still a lot to be
done (setting up lease time, timeouts, more error handling), but it's
possible to implement these later on.

fc428fcf

elf: add error print for unknown relocation · 1a2b4b87

Guy Zana authored 11 years ago

1a2b4b87

semaphore: switch to new with_lock() · aa73a263

Avi Kivity authored 11 years ago

aa73a263

sched: switch to new with_lock() · df15d4e0

Avi Kivity authored 11 years ago

df15d4e0

rwlock: switch to new with_lock() · 714bdb43

Avi Kivity authored 11 years ago

714bdb43

mempool: switch to new with_lock() · ebfb17f7

Avi Kivity authored 11 years ago

ebfb17f7