- Sep 08, 2013
-
-
Nadav Har'El authored
Currently, clock::get()->time() jumps (by system_time(), i.e., the host's uptime) at some point during the initialization. This can be a huge jump (e.g., a week if the host's uptime is a week). Fixing this jump is hard, so we'd rather just tolerate it.

reschedule_from_interrupt() handles this clock jump badly. It calculates current_run, the amount of time the current thread has run, to include this jump while the thread was running. In the above example, a run time of a whole week is wrongly attributed to some thread, and added to its vruntime, causing it not to be scheduled again until all other threads yield the CPU.

The fix in this patch is to limit the vruntime increase after a long run to max_slice (10ms). Even if a thread runs for longer (or just thinks it ran for longer), it won't be "penalized" in its dynamic priority more than a thread that ran for 10ms. Note that this cap makes sense, as cpu::enqueue already enforces a similar limit on the vruntime "bonus" of a woken thread, and this patch works toward a similar goal (avoid giving one thread a huge bonus because another thread was given a huge penalty).

This bug is very visible in the CPU-bound SPECjvm2008 benchmarks, when running two benchmark threads on two virtual cpus. As it happens, the load_balancer() is the one that gets the huge vruntime increase, so it doesn't get to run until no other thread wants to run. Because we start with both CPU-bound threads on the same CPU, and these hardly yield the CPU (and even more rarely are the two threads sleeping at the same time), the load balancer thread on this CPU doesn't get to run, and the two threads remain on the same CPU, giving us halved performance (2-cpu performance identical to 1-cpu performance) and on the host we see qemu using 100% cpu, instead of 200% as expected with two vcpus.
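A minimal sketch of the capping idea (max_slice matches the message; the function shape and names are illustrative, not OSv's actual scheduler code):

    #include <algorithm>
    #include <cstdint>

    using nanotime_t = std::uint64_t;

    constexpr nanotime_t max_slice = 10'000'000; // 10ms, in nanoseconds

    nanotime_t charge_for_run(nanotime_t now, nanotime_t running_since)
    {
        nanotime_t current_run = now - running_since;
        // Clock jumps (e.g., host uptime added during init) can make
        // current_run huge; cap the vruntime penalty so one jump cannot
        // starve the thread relative to its peers.
        return std::min(current_run, max_slice);
    }

With the cap, a clock jump of a week costs the running thread at most one maximal time slice of vruntime.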
-
Guy Zana authored
-
- Sep 03, 2013
-
-
Avi Kivity authored
In an attempt to be clever, we define irq_lock as an object in an anonymous namespace, so that each translation unit gets its own copy, which is then optimized away, since the object is never touched. But the compiler complains that the object is defined but not used if we include the file but don't use irq_lock. Simplify by only declaring the object there, and defining it somewhere else.
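The declare-in-header, define-once pattern, sketched (irq_lock_type here is a stand-in for the real type):

    // irqlock.hh -- only *declares* the object, so any translation unit
    // can include it without triggering a defined-but-not-used warning.
    struct irq_lock_type {
        void lock() {}      // stub bodies for the sketch
        void unlock() {}
    };
    extern irq_lock_type irq_lock;

    // irqlock.cc -- exactly one translation unit provides the definition.
    irq_lock_type irq_lock;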
-
- Sep 02, 2013
-
-
Pekka Enberg authored
This adds a simple msync() implementation for file-backed memory maps. It uses the newly added 'file_vma' data structure to write out and fsync the msync'd region, as suggested by Avi Kivity.
-
Pekka Enberg authored
Add a new 'file_vma' class that extends 'vma'. This is needed to keep track of fileref and offset for file-backed VMAs for msync().
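A rough sketch of what such a class can look like (illustrative only; OSv's real vma and fileref types differ):

    #include <cstdint>
    #include <memory>

    struct file;                        // stand-in for the real file object
    using fileref = std::shared_ptr<file>;

    class vma {
    public:
        vma(std::uintptr_t start, std::uintptr_t end) : _start(start), _end(end) {}
        virtual ~vma() = default;
        std::uintptr_t start() const { return _start; }
        std::uintptr_t end() const { return _end; }
    private:
        std::uintptr_t _start, _end;
    };

    // A file-backed vma additionally remembers which file it maps and at
    // what offset -- exactly what msync() needs in order to write the
    // dirty range back and fsync the file.
    class file_vma : public vma {
    public:
        file_vma(std::uintptr_t start, std::uintptr_t end,
                 fileref f, std::uint64_t offset)
            : vma(start, end), _file(std::move(f)), _offset(offset) {}
        fileref fref() const { return _file; }
        std::uint64_t offset() const { return _offset; }
    private:
        fileref _file;           // keeps the file alive while mapped
        std::uint64_t _offset;   // file offset corresponding to start()
    };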
-
- Aug 29, 2013
-
-
Avi Kivity authored
-
- Aug 27, 2013
-
-
Nadav Har'El authored
Commit 65afd075 fixed mincore() to recognize unmapped addresses. However, it used mmu::ismapped(), which only checks for mmap()'ed addresses and doesn't know about malloc()'ed memory. This causes trouble for libunwind (which we use for backtrace()), which tests mincore() on an on-stack variable - and for non-pthread threads, this stack might be malloc'ed, not mmap'ed. So this patch adds mmu::isreadable(), which checks that a given memory range is all readable (this memory can be mmapped, malloced, stack, whatever). mincore() now uses that. mmu::isreadable() is implemented, following Avi's idea, by trying to read, with safe_load(), one byte from every page in the range. This approach is faster than page-table walking, especially for one-byte checks (which is all libunwind uses anyway), and is also very simple.
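A sketch of the probing approach (safe_load() below is a trivial stand-in; the real primitive recovers from the page fault and returns false instead of crashing on an unmapped page):

    #include <cstddef>
    #include <cstdint>

    constexpr std::uintptr_t page_size = 4096;

    // Stand-in: the real safe_load() traps faults and reports failure;
    // this sketch version assumes the access is valid.
    static bool safe_load(const char* p, char& out) { out = *p; return true; }

    bool isreadable(const void* addr, std::size_t size)
    {
        char tmp;
        auto p = reinterpret_cast<std::uintptr_t>(addr);
        auto end = p + size;
        // Probe one byte in every page the range touches; stepping to the
        // next page boundary covers the whole range with minimal reads.
        for (auto a = p; a < end; a = (a + page_size) & ~(page_size - 1)) {
            if (!safe_load(reinterpret_cast<const char*>(a), tmp)) {
                return false;
            }
        }
        return true;
    }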
-
Glauber Costa authored
Most of the performance problems I have found on Xen were due to the fact that we were hitting malloc_large consistently, for allocations that we should be able to service in some other way. Because malloc_large in our implementation is such a bottleneck, it was very useful for me to have separate tracepoints for it, so I am proposing them for inclusion.
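Sketch of what a dedicated trace event buys (the TRACEPOINT stand-in below only exists to make the sketch self-contained; OSv's real macro creates a patchable trace site, not a printf):

    #include <cstddef>
    #include <cstdio>
    #include <cstdlib>

    // Stand-in for OSv's TRACEPOINT machinery -- an assumption for this
    // sketch, not the real macro's signature.
    #define TRACEPOINT(name, fmt)                                    \
        static void name(void* buf, std::size_t len)                 \
        { std::fprintf(stderr, "%s: " fmt "\n", #name, buf, len); }

    TRACEPOINT(trace_memory_malloc_large, "buf=%p, len=%zu")

    static void* malloc_large(std::size_t size)
    {
        void* buf = std::malloc(size);          // real code allocates whole pages
        trace_memory_malloc_large(buf, size);   // its own event: easy to filter
        return buf;
    }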
-
Nadav Har'El authored
Commit 65afd075, which fixed mincore(), exposed a deadlock in the leak detector, caused by two threads taking two locks in opposite order:

Thread 1: malloc() does alloc_tracker::remember(). This takes the tracker lock and calls backtrace(), which calls mincore(), which takes the vma_list_mutex.

Thread 2: mmap() does mmu::allocate(), which takes the vma_list_mutex and then, through mmu::populate::small_page, calls memory::alloc_page(), which calls alloc_tracker::remember() and takes the tracker lock.

This patch fixes the deadlock: alloc_tracker::remember() will now drop its lock while running backtrace(), as the lock is only needed to protect the allocations[] array. We need to retake the lock after backtrace() completes, to copy the backtrace back to the allocations[] array.

Previously, the lock's depth was also (ab)used for avoiding nested allocation tracking (e.g., tracking of memory allocation done inside backtrace() itself), but now that backtrace() runs without the lock, we need a different mechanism - a per-thread "in_tracker" flag, which is turned on inside the alloc_tracker::remember()/forget() methods.
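A simplified sketch of the new locking discipline (the data structures are illustrative; the real alloc_tracker keeps a full allocations[] array):

    #include <execinfo.h>
    #include <cstring>
    #include <mutex>

    constexpr int max_frames = 20;

    class alloc_tracker {
    public:
        void remember(void* addr)
        {
            static thread_local bool in_tracker = false;
            if (in_tracker) {
                return;          // don't track backtrace()'s own allocations
            }
            in_tracker = true;
            // Collect the backtrace WITHOUT holding the tracker lock, so
            // we never hold it while backtrace()'s mincore() path takes
            // vma_list_mutex (the reverse of mmap()'s locking order).
            void* frames[max_frames];
            int n = backtrace(frames, max_frames);
            {
                // Retake the lock only to copy into the record.
                std::lock_guard<std::mutex> guard(_lock);
                _last.addr = addr;
                _last.nframes = n;
                std::memcpy(_last.frames, frames, n * sizeof(void*));
            }
            in_tracker = false;
        }
    private:
        struct record { void* addr; void* frames[max_frames]; int nframes; };
        std::mutex _lock;
        record _last{};   // the real tracker keeps an allocations[] array
    };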
-
- Aug 26, 2013
-
-
Nadav Har'El authored
sched.hh included elf.hh just so it could refer to the elf::tls_data type. But now that we have rcu.hh, which includes sched.hh and therefore elf.hh, we get an include-loop mess if we wish to use rcu in elf.hh (we'll do this in a later patch). So it is better not to include elf.hh from sched.hh, and to just declare the one struct we need. Now that sched.hh no longer includes elf.hh and the dozen headers that it further included, we need to add the missing includes to some of the code that included sched.hh and relied on its implicit includes.
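The pattern, sketched (assuming sched.hh uses tls_data only by reference or pointer, a forward declaration suffices and breaks the include loop):

    // sched.hh: no #include <osv/elf.hh> needed
    namespace elf {
        struct tls_data;   // forward declaration of the one struct we use
    }

    namespace sched {
        void set_tls(const elf::tls_data& tls);  // reference use: complete
                                                 // type not required here
    }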
-
Avi Kivity authored
Trying to execute the null pointer, or faulting within kernel code, is a really bad sign, and it's better to abort early when it happens.
-
Pekka Enberg authored
If the leak detector is enabled after OSv startup, the first call can be to free(), not malloc(). Fix alloctracker::forget() to deal with that. Fixes the SIGSEGV when "osv leak on" is used to enable detection from gdb after OSv has started up:

    #
    # A fatal error has been detected by the Java Runtime Environment:
    #
    # SIGSEGV (0xb) at pc=0x00000000003b8ee6, pid=0, tid=18446673706168635392
    #
    # JRE version: 7.0_25
    # Java VM: OpenJDK 64-Bit Server VM (23.7-b01 mixed mode linux-amd64 compressed oops)
    # Problematic frame:
    # C 0x00000000003b8ee6
    #
    # Core dump written. Default location: //core or core.0
    #
    # An error report file with more information is saved as:
    # /tmp/jvm-0/hs_error.log
    #
    # If you would like to submit a bug report, please include
    # instructions on how to reproduce the bug and visit:
    # http://icedtea.classpath.org/bugzilla
    #
    Aborted

    [penberg@localhost osv]$ addr2line -e build/debug/loader.elf 0x00000000003b8ee6
    /home/penberg/osv/build/debug/../../core/alloctracker.cc:90
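A simplified sketch of the guard (member names are hypothetical; the real alloctracker differs):

    #include <mutex>

    struct alloc_record { void* addr; };

    class alloc_tracker {
    public:
        // forget() must tolerate a free() of a pointer that was never
        // remember()ed -- e.g. tracking was enabled, via "osv leak on",
        // after the allocation was already made.
        void forget(void* addr)
        {
            if (!addr) {
                return;
            }
            std::lock_guard<std::mutex> guard(_lock);
            if (!_allocations) {
                return;  // tracking started late; table not yet built
            }
            for (int i = 0; i < _size; i++) {
                if (_allocations[i].addr == addr) {
                    _allocations[i].addr = nullptr;  // clear the entry
                    return;
                }
            }
            // not found: allocated before tracking was enabled; ignore
        }
    private:
        std::mutex _lock;
        alloc_record* _allocations = nullptr;  // lazily allocated table
        int _size = 0;
    };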
-
- Aug 25, 2013
-
-
Avi Kivity authored
Waiting for a quiescent state happens in two stages: first, we request all cpus to schedule at least once; then, we wait until they do so. If, between the two stages, a cpu is brought online, then we will request N cpus to schedule but wait for N+1 to respond. The N+1th response of course never comes, and the system hangs. Fix by copying the vector which holds the cpus we signal and wait for, forcing the two stages to be consistent. This is safe, since newly-added cpus cannot be accessing any rcu-protected variables before we start signalling. Fixes random hangs with rcu, mostly seen with 'perf callstack'.
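A sketch of the snapshot idea (simplified types, not OSv's actual rcu code):

    #include <atomic>
    #include <vector>

    struct cpu {
        std::atomic<bool> scheduled{false};
        void request_quiescence() { /* real code: per-cpu flag + wakeup */ }
        bool responded() const { return scheduled.load(); }
    };

    std::vector<cpu*> all_cpus;   // may grow if a cpu is brought online

    void wait_for_quiescent_state()
    {
        // Snapshot once: both stages operate on the same set of cpus, so
        // a cpu hotplugged in between is neither signalled nor waited for.
        std::vector<cpu*> cpus = all_cpus;
        for (cpu* c : cpus) {
            c->request_quiescence();   // stage 1: ask N cpus to schedule
        }
        for (cpu* c : cpus) {
            while (!c->responded()) {
                // stage 2: wait for the same N cpus -- never N+1
            }
        }
    }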
-
- Aug 19, 2013
-
-
Avi Kivity authored
-
- Aug 18, 2013
-
-
Avi Kivity authored
The QEMU DHCP server responds with broadcast packets; allow them.
-
Avi Kivity authored
Following 71fec998, we note that if any bit in the wakeup mask is set, then an IPI to that cpu is either imminent or already in flight, and we can elide our own IPI to that cpu.
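Sketched, under the assumption of a per-cpu wakeup bitmask as described in the message:

    #include <atomic>
    #include <cstdint>

    struct cpu {
        std::atomic<std::uint64_t> wakeup_mask{0};   // one bit per waking cpu
        void send_ipi() { /* real code: APIC IPI, which costs a vmexit */ }
    };

    void wake_cpu(cpu& target, unsigned my_cpu_id)
    {
        std::uint64_t bit = std::uint64_t(1) << my_cpu_id;
        // If any bit was already set, an IPI to the target is imminent or
        // in flight, and it will observe our bit too -- elide our own IPI.
        if (target.wakeup_mask.fetch_or(bit) == 0) {
            target.send_ipi();
        }
    }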
-
- Aug 16, 2013
-
-
Pekka Enberg authored
Avoid sending an IPI to a CPU that's already being woken up by another IPI. This reduces IPIs by 17% for a cassandra-stress run. Execution time is obviously unaffected because execution is bound by lock contention.

Before:

    [penberg@localhost ~]$ sudo perf kvm stat -e kvm:* -p `pidof qemu-system-x86_64`
    ^C
     Performance counter stats for process id '610':

         6,909,333 kvm:kvm_entry
                 0 kvm:kvm_hypercall
                 0 kvm:kvm_hv_hypercall
         1,035,125 kvm:kvm_pio
                 0 kvm:kvm_cpuid
         5,149,393 kvm:kvm_apic
         6,909,369 kvm:kvm_exit
         2,108,440 kvm:kvm_inj_virq
                 0 kvm:kvm_inj_exception
               982 kvm:kvm_page_fault
         2,783,005 kvm:kvm_msr
                 0 kvm:kvm_cr
             7,354 kvm:kvm_pic_set_irq
         2,366,388 kvm:kvm_apic_ipi
         2,468,569 kvm:kvm_apic_accept_irq
         2,067,044 kvm:kvm_eoi
         1,982,000 kvm:kvm_pv_eoi
                 0 kvm:kvm_nested_vmrun
                 0 kvm:kvm_nested_intercepts
                 0 kvm:kvm_nested_vmexit
                 0 kvm:kvm_nested_vmexit_inject
                 0 kvm:kvm_nested_intr_vmexit
                 0 kvm:kvm_invlpga
                 0 kvm:kvm_skinit
             3,677 kvm:kvm_emulate_insn
                 0 kvm:vcpu_match_mmio
                 0 kvm:kvm_update_master_clock
                 0 kvm:kvm_track_tsc
             7,354 kvm:kvm_userspace_exit
             7,354 kvm:kvm_set_irq
             7,354 kvm:kvm_ioapic_set_irq
               674 kvm:kvm_msi_set_irq
                 0 kvm:kvm_ack_irq
                 0 kvm:kvm_mmio
           609,915 kvm:kvm_fpu
                 0 kvm:kvm_age_page
                 0 kvm:kvm_try_async_get_page
                 0 kvm:kvm_async_pf_doublefault
                 0 kvm:kvm_async_pf_not_present
                 0 kvm:kvm_async_pf_ready
                 0 kvm:kvm_async_pf_completed

      81.180469772 seconds time elapsed

After:

    [penberg@localhost ~]$ sudo perf kvm stat -e kvm:* -p `pidof qemu-system-x86_64`
    ^C
     Performance counter stats for process id '30824':

         6,411,175 kvm:kvm_entry                  [100.00%]
                 0 kvm:kvm_hypercall              [100.00%]
                 0 kvm:kvm_hv_hypercall           [100.00%]
           992,454 kvm:kvm_pio                    [100.00%]
                 0 kvm:kvm_cpuid                  [100.00%]
         4,300,001 kvm:kvm_apic                   [100.00%]
         6,411,133 kvm:kvm_exit                   [100.00%]
         2,055,189 kvm:kvm_inj_virq               [100.00%]
                 0 kvm:kvm_inj_exception          [100.00%]
             9,760 kvm:kvm_page_fault             [100.00%]
         2,356,260 kvm:kvm_msr                    [100.00%]
                 0 kvm:kvm_cr                     [100.00%]
             3,354 kvm:kvm_pic_set_irq            [100.00%]
         1,943,731 kvm:kvm_apic_ipi               [100.00%]
         2,047,024 kvm:kvm_apic_accept_irq        [100.00%]
         2,019,044 kvm:kvm_eoi                    [100.00%]
         1,949,821 kvm:kvm_pv_eoi                 [100.00%]
                 0 kvm:kvm_nested_vmrun           [100.00%]
                 0 kvm:kvm_nested_intercepts      [100.00%]
                 0 kvm:kvm_nested_vmexit          [100.00%]
                 0 kvm:kvm_nested_vmexit_inject   [100.00%]
                 0 kvm:kvm_nested_intr_vmexit     [100.00%]
                 0 kvm:kvm_invlpga                [100.00%]
                 0 kvm:kvm_skinit                 [100.00%]
             1,677 kvm:kvm_emulate_insn           [100.00%]
                 0 kvm:vcpu_match_mmio            [100.00%]
                 0 kvm:kvm_update_master_clock    [100.00%]
                 0 kvm:kvm_track_tsc              [100.00%]
             3,354 kvm:kvm_userspace_exit         [100.00%]
             3,354 kvm:kvm_set_irq                [100.00%]
             3,354 kvm:kvm_ioapic_set_irq         [100.00%]
               927 kvm:kvm_msi_set_irq            [100.00%]
                 0 kvm:kvm_ack_irq                [100.00%]
                 0 kvm:kvm_mmio                   [100.00%]
           620,278 kvm:kvm_fpu                    [100.00%]
                 0 kvm:kvm_age_page               [100.00%]
                 0 kvm:kvm_try_async_get_page     [100.00%]
                 0 kvm:kvm_async_pf_doublefault   [100.00%]
                 0 kvm:kvm_async_pf_not_present   [100.00%]
                 0 kvm:kvm_async_pf_ready         [100.00%]
                 0 kvm:kvm_async_pf_completed     [100.00%]

      79.947992238 seconds time elapsed
-
Pekka Enberg authored
Starting up Cassandra with the debug memory allocator GPFs as follows:

    Breakpoint 1, abort () at ../../runtime.cc:85
    85      {
    (gdb) bt
    #0  abort () at ../../runtime.cc:85
    #1  0x0000000000375812 in osv::generate_signal (siginfo=..., ef=ef@entry=0xffffc0003ffe3008) at ../../libc/signal.cc:40
    #2  0x000000000037587c in osv::handle_segmentation_fault (addr=addr@entry=18446708889768681440, ef=ef@entry=0xffffc0003ffe3008) at ../../libc/signal.cc:55
    #3  0x00000000002fba02 in page_fault (ef=0xffffc0003ffe3008) at ../../core/mmu.cc:876
    #4  <signal handler called>
    #5  dbg::realloc (v=v@entry=0xffffe00019b3e000, size=size@entry=16) at ../../core/mempool.cc:846
    #6  0x000000000032654c in realloc (obj=0xffffe00019b3e000, size=16) at ../../core/mempool.cc:870
    #7  0x0000100000627743 in ?? ()
    #8  0x00002000001fe770 in ?? ()
    #9  0x00002000001fe780 in ?? ()
    #10 0x00002000001fe710 in ?? ()
    #11 0x00002000001fe700 in ?? ()
    #12 0xffffe000170e8000 in ?? ()
    #13 0x0000000200000001 in ?? ()
    #14 0x0000000000000020 in ?? ()
    #15 0x00002000001ffe70 in ?? ()
    #16 0xffffe000170e0004 in ?? ()
    #17 0x000000000036f361 in strcpy (dest=0x100001087420 "", src=<optimized out>) at ../../libc/string/strcpy.c:8
    #18 0x0000100000629b53 in ?? ()
    #19 0xffffe00019b22000 in ?? ()
    #20 0x0000000000000001 in ?? ()
    #21 0x0000000000000000 in ?? ()

The problem was introduced in commit 1ea5672f ("memory: let the debug allocator mimic the standard allocator more closely") which forgot to convert realloc() to use 'pad_before'.
-
- Aug 15, 2013
-
-
Avi Kivity authored
An allocation that is larger than half a page, but smaller than a page, will end up badly aligned. Work around it by using the large allocators for objects larger than half a page. This is wasteful and slow but at least it works. Later we can improve this by moving the slab header to the end of the page, so it doesn't interfere with alignment.
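The routing decision, sketched (the allocator stand-ins below are hypothetical; OSv's real pool and large allocators differ):

    #include <cstddef>
    #include <cstdlib>

    constexpr std::size_t page_size = 4096;

    static void* malloc_pool(std::size_t size)    // slab-style, in-page header
    {
        return std::malloc(size);
    }

    static void* malloc_large(std::size_t size)   // whole, naturally aligned pages
    {
        std::size_t rounded = (size + page_size - 1) & ~(page_size - 1);
        return std::aligned_alloc(page_size, rounded);
    }

    void* osv_malloc(std::size_t size)
    {
        // A pool object between half a page and a page would share its
        // page with the slab header and end up badly aligned; route it
        // to the page allocator instead -- wasteful, but aligned.
        if (size > page_size / 2) {
            return malloc_large(size);
        }
        return malloc_pool(size);
    }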
-
Pekka Enberg authored
Building OSv with the debug memory allocator enabled:

    $ make -j mode=debug conf-preempt=0 conf-debug_memory=1

causes the guest to enter a busy loop right after the JVM starts up:

    $ ./scripts/run.py -d
    [...]
    OpenJDK 64-Bit Server VM warning: Can't detect initial thread stack location - find_vma failed

GDB explains:

    #0  0x00000000003b5c54 in boost::intrusive::rbtree_impl<boost::intrusive::setopt<boost::intrusive::detail::member_hook_traits<memory::page_range, boost::intrusive::set_member_hook<boost::intrusive::none, boost::intrusive::none, boost::intrusive::none, boost::intrusive::none>, &memory::page_range::member_hook>, memory::addr_cmp, unsigned long, true> >::private_erase (this=0x1d2f8c8 <memory::free_page_ranges+8>, b=..., e=..., n=@0x3b40e9: 6179885759521391432) at ../../external/misc.bin/usr/include/boost/intrusive/rbtree.hpp:1417
    #1  0x00000000003b552e in boost::intrusive::rbtree_impl<boost::intrusive::setopt<boost::intrusive::detail::member_hook_traits<memory::page_range, boost::intrusive::set_member_hook<boost::intrusive::none, boost::intrusive::none, boost::intrusive::none, boost::intrusive::none>, &memory::page_range::member_hook>, memory::addr_cmp, unsigned long, true> >::erase<memory::page_range, memory::addr_cmp>(memory::page_range const&, memory::addr_cmp, boost::intrusive::detail::enable_if_c<!boost::intrusive::detail::is_convertible<memory::addr_cmp, boost::intrusive::tree_iterator<boost::intrusive::rbtree_impl<boost::intrusive::setopt<boost::intrusive::detail::member_hook_traits<memory::page_range, boost::intrusive::set_member_hook<boost::intrusive::none, boost::intrusive::none, boost::intrusive::none, boost::intrusive::none>, &memory::page_range::member_hook>, memory::addr_cmp, unsigned long, true> >, true> >::value, void>::type*) (this=0x1d2f8c0 <memory::free_page_ranges>, key=..., comp=...) at ../../external/misc.bin/usr/include/boost/intrusive/rbtree.hpp:878
    #2  0x00000000003b4c4e in boost::intrusive::rbtree_impl<boost::intrusive::setopt<boost::intrusive::detail::member_hook_traits<memory::page_range, boost::intrusive::set_member_hook<boost::intrusive::none, boost::intrusive::none, boost::intrusive::none, boost::intrusive::none>, &memory::page_range::member_hook>, memory::addr_cmp, unsigned long, true> >::erase (this=0x1d2f8c0 <memory::free_page_ranges>, value=...) at ../../external/misc.bin/usr/include/boost/intrusive/rbtree.hpp:856
    #3  0x00000000003b4145 in boost::intrusive::set_impl<boost::intrusive::setopt<boost::intrusive::detail::member_hook_traits<memory::page_range, boost::intrusive::set_member_hook<boost::intrusive::none, boost::intrusive::none, boost::intrusive::none, boost::intrusive::none>, &memory::page_range::member_hook>, memory::addr_cmp, unsigned long, true> >::erase (this=0x1d2f8c0 <memory::free_page_ranges>, value=...) at ../../external/misc.bin/usr/include/boost/intrusive/set.hpp:601
    #4  0x00000000003b0130 in memory::refill_page_buffer () at ../../core/mempool.cc:487
    #5  0x00000000003b05f8 in memory::untracked_alloc_page () at ../../core/mempool.cc:569
    #6  0x00000000003b0631 in memory::alloc_page () at ../../core/mempool.cc:577
    #7  0x0000000000367a7c in mmu::populate::small_page (this=0x2000001fd460, ptep=..., offset=0) at ../../core/mmu.cc:456
    #8  0x0000000000365b00 in mmu::page_range_operation::operate_page (this=0x2000001fd460, huge=false, addr=0xffffe0004ec9b000, offset=0) at ../../core/mmu.cc:438
    #9  0x0000000000365790 in mmu::page_range_operation::operate (this=0x2000001fd460, start=0xffffe0004ec9b000, size=4096) at ../../core/mmu.cc:387
    #10 0x0000000000366148 in mmu::vpopulate (addr=0xffffe0004ec9b000, size=4096) at ../../core/mmu.cc:657
    #11 0x00000000003b0d8d in dbg::malloc (size=16) at ../../core/mempool.cc:818
    #12 0x00000000003b0f32 in malloc (size=16) at ../../core/mempool.cc:854

Fix the problem by checking whether free_page_ranges is empty in refill_page_buffer(). This fixes the busy loop and shows what's really happening:

    OpenJDK 64-Bit Server VM warning: Can't detect initial thread stack location - find_vma failed
    alloc_page(): out of memory
    Aborted
-
- Aug 14, 2013
-
-
Pekka Enberg authored
As suggested by Avi, RCU-protect tracepoint_base::probes to make sure probes are really stopped before the caller accesses the collected traces.
-
Avi Kivity authored
-
Pekka Enberg authored
A call to erase() invalidates iterators, so switch from a range-based for loop to using iterators manually. This fixes a bug that resulted in the JVM crashing on SMP when "perf callstack" was run:

    #
    # A fatal error has been detected by the Java Runtime Environment:
    #
    # SIGSEGV (0xb) at pc=0x0000000000328a44, pid=0, tid=18446673706080178176
    #
    # JRE version: 7.0_19
    # Java VM: OpenJDK 64-Bit Server VM (23.7-b01 mixed mode linux-amd64 compressed oops)
    # Problematic frame:
    # C 0x0000000000328a44
    #
    # Core dump written. Default location: //core or core.0
    #
    # An error report file with more information is saved as:
    # /tmp/jvm-0/hs_error.log
    #
    # If you would like to submit a bug report, please include
    # instructions on how to reproduce the bug and visit:
    # http://icedtea.classpath.org/bugzilla
    #
    Aborted
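For reference, the iterator-safe shape of such a loop (a generic illustration, not the patched OSv code):

    #include <list>

    void remove_even(std::list<int>& l)
    {
        for (auto it = l.begin(); it != l.end(); /* no ++it here */) {
            if (*it % 2 == 0) {
                it = l.erase(it);   // erase invalidates 'it'; continue
                                    // from the returned iterator
            } else {
                ++it;
            }
        }
    }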
-
- Aug 13, 2013
-
-
Avi Kivity authored
The release build optimizes away references to this object, but the debug build does not. Define it.
-
Avi Kivity authored
We don't initialize the DHCP packets, so some of them get the relay agent IP set, and the DHCP DISCOVER packets get sent to a random address on the Internet. That address usually doesn't have a DHCP server behind it, so the guest does not get configured. Fix by zero-initializing the packet.
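The essence of the fix, sketched with an abbreviated packet layout (giaddr is the standard BOOTP/DHCP relay agent field; the rest of the fields are elided):

    #include <cstdint>
    #include <cstring>

    struct dhcp_packet {
        std::uint8_t  op, htype, hlen, hops;
        std::uint32_t xid;
        std::uint32_t giaddr;   // relay agent IP: must be 0 for us
        // ... remaining BOOTP/DHCP fields ...
    };

    void send_discover()
    {
        dhcp_packet pkt{};             // value-init: every field zeroed,
                                       // including giaddr
        // equivalently: std::memset(&pkt, 0, sizeof(pkt));
        pkt.op = 1;                    // BOOTREQUEST
        // ... fill in the fields we actually use, then transmit ...
    }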
-
- Aug 12, 2013
-
-
Avi Kivity authored
Currently we statically link to libstdc++ and libgcc_s, and also dynamically link to the same libraries (since the payload requires them). This causes some symbols to be available from both the static and dynamic version. With the resolution order change introduced by 82513d41, we can resolve the same symbol to different addresses at different times. This violates the One Definition Rule, and in fact breaks std::string's destructor. Fix by only linking in the libraries statically. We use ld's --whole-archive flag to bring in all symbols, including those that may be used by the payload but not by the kernel. Some symbols now become duplicates; we drop our version.
-
- Aug 11, 2013
-
-
Avi Kivity authored
This adds fairly basic support for rcu.

Declaring:

    mutex mtx;
    rcu_ptr<my_object> my_ptr;

Read-side:

    WITH_LOCK(rcu_read_lock) {
        const my_object* p = my_ptr.read();
        // do things with *p
        // but don't block!
    }

Write-side:

    WITH_LOCK(mtx) {
        my_object* old = my_ptr.read_by_owner();
        my_object* p = new my_object;
        // ...
        my_ptr.assign(p);
        rcu_dispose(old);  // or rcu_defer(some_func, old);
    }
-
- Aug 08, 2013
-
-
Nadav Har'El authored
This patch fixes the following bug, of CLI & memcached on two vcpus crashing on startup.

The cause of the crash is this: Java is running two threads. One loads a new shared library (in this example, libnio.so), and the second thread just runs normally and calls some function it hasn't run before (pthread_cond_destroy()). When our on-demand resolver code tries to resolve this function name, it iterates over the module list and sees libnio.so, but this object hasn't been completely set up yet (we put it in the list first - see program::add_object()), so looking up a symbol in it crashes.

Why hasn't this problem been noticed before the recent link-order change? Because before that change, the half-loaded library was always last in the list (OSv itself was the first), so existing symbols were always found before reaching the partially-set-up object. Now OSv, with many symbols, is last, and the half-set-up object is in the middle, so the problem is common. But it could also happen previously, if we had unresolved symbols (e.g., weak symbols) - these were probably just rare enough for the bug not to happen in practice.

The fix in this patch is "hacky", because I wanted to avoid restructuring the whole code. The problem is that the functions called in add_object() (including relocate_rela(), nested add_object(), etc.) all assume that they can look up symbols in the being-set-up object, while we don't want these objects to be visible to other threads. So we do exactly this - each object gets a "visibility" field. If "visibility" is 0, all threads can use this object, but if visibility is a thread pointer, only this thread searches in this object. So add_object() starts the object with visibility set to its thread, and only when add_object() is done, it sets the visibility to 0 so all threads can see it.

While this solves the common bug, note that this patch still leaves a small room for SMP bugs, because it doesn't add locking to _modules, so a lookup during an add_object() can see a broken vector for a short duration. We should fix this remaining problem later, using RCU.
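A simplified sketch of the visibility check (the thread type and current() are stand-ins for OSv's sched::thread machinery):

    #include <atomic>

    namespace sched {
        struct thread {};
        // Stand-in for the real sched::thread::current().
        inline thread* current() { static thread_local thread t; return &t; }
    }

    class object {
    public:
        // While being set up, the object is visible only to the loader thread.
        void set_private() { _visibility = sched::current(); }
        // Once fully relocated and initialized, publish to everyone.
        void make_public() { _visibility.store(nullptr, std::memory_order_release); }
        bool visible() const {
            auto v = _visibility.load(std::memory_order_acquire);
            return v == nullptr || v == sched::current();
        }
    private:
        std::atomic<sched::thread*> _visibility{nullptr};
    };

    // Symbol lookup then becomes, roughly:
    //   for (object* obj : modules) {
    //       if (!obj->visible()) continue;  // skip half-loaded objects
    //       ... look up the symbol in obj ...
    //   }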
-
Nadav Har'El authored
This patch fixes the exception-handling bug seen in tst-except.so. The callback given to dl_iterate_phdr (such as _Unwind_IteratePhdrCallback in libgcc_eh.a, used to implement exceptions) may decide to cache previous information we gave it, as long as the "adds" and "subs" fields are unchanged. The bug was that we passed the callback an on-stack *copy* of the obj->_phdrs vector, and if the callback saved pointers into that copy, they became invalid on the next call. The pointers need to remain valid as long as adds/subs do not change, so we must pass the actual obj->_phdrs (which doesn't change after the object's load), NOT a copy. Note there's a locking issue remaining here: if someone dlclose()s an object while the callback is running (and has already checked adds/subs), it can use a stale pointer. This should be fixed separately, probably by using reference counting on objects.
-
Nadav Har'El authored
The callback function passed to dl_iterate_phdr, such as _Unwind_IteratePhdrCallback (used in libgcc_eh.a to implement exceptions), may want to cache previous lookups, and wants to know when the list of iterated modules hasn't changed since the last call to dl_iterate_phdr. For this, dl_iterate_phdr() is supposed to fill two fields, dlpi_adds and dlpi_subs, counting the number of times objects were loaded into or unloaded from the program. If both dlpi_subs and dlpi_adds are unchanged, the callback is guaranteed that the list of objects is unchanged. In the existing code, we forgot to set these two fields, so they got random values which caused the exception-unwinding code to sometimes cache and sometimes not, depending on the phase of the moon. This patch adds the counting of the correct "subs" and "adds" counters, and after it, exception unwinding will always use its cache (as long as the list of objects doesn't change). Note that this does NOT fix the crash in tst-except.so. That is a bug which appears when caching is enabled (which, before this patch, happened randomly), and will be fixed by the next patch.
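A sketch of the bookkeeping (simplified; dlpi_adds/dlpi_subs are the standard dl_phdr_info fields the callback compares, but the surrounding structures here are stand-ins):

    #include <link.h>
    #include <atomic>
    #include <cstddef>
    #include <vector>

    // Simplified stand-in: each loaded object keeps a dl_phdr_info whose
    // dlpi_phdr points into the object's own, stable phdr storage.
    struct loaded_object { dl_phdr_info info; };

    static std::vector<loaded_object> modules;
    static std::atomic<unsigned long long> object_adds{0};  // ++ in add_object()
    static std::atomic<unsigned long long> object_subs{0};  // ++ on unload

    int osv_dl_iterate_phdr(int (*cb)(dl_phdr_info*, std::size_t, void*),
                            void* data)
    {
        int ret = 0;
        for (auto& obj : modules) {
            dl_phdr_info info = obj.info;  // dlpi_phdr stays a stable pointer
            info.dlpi_adds = object_adds;  // an unchanged (adds, subs) pair
            info.dlpi_subs = object_subs;  // tells the callback its cache holds
            if ((ret = cb(&info, sizeof(info), data))) {
                break;
            }
        }
        return ret;
    }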
-
Glauber Costa authored
Our console write() takes 3 parameters. The last one controls whether or not we will issue a newline at the end of input. If it is true, we will call the console's implementation of newline(). It is always passed as false, though. Remove it and fix the callers.
-
- Aug 07, 2013
-
-
Pekka Enberg authored
-
Pekka Enberg authored
-
- Aug 06, 2013
-
-
Nadav Har'El authored
The option to undef LOCKFREE_MUTEX in osv/mutex.h, to enable the old spin-based mutex, got broken after the addition of the wait-morphing feature. Wait morphing needs two features which were never added to the spin-based option, and probably never will be (we should remove the old mutex implementation in the near future):

1. The mutex->send_lock() feature, on which the wait-morphing feature is built, and

2. The ability to add another field, "user_mutex", to condvar, which means condvar cannot be constrained in size, so libc/pthread.cc must use a pointer - which we currently do only for the larger LOCKFREE_MUTEX.

So this patch re-adds the "WAIT_MORPHING" compile-time option, and disables WAIT_MORPHING if LOCKFREE_MUTEX is disabled. Now OSv can be compiled with #undef LOCKFREE_MUTEX.
-
Pekka Enberg authored
Make calloc() deal with malloc() returning NULL.
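The essence of the fix (a generic sketch, not OSv's exact code; the overflow check is an extra safety measure, and errno handling is elided):

    #include <cstdint>
    #include <cstdlib>
    #include <cstring>

    void* my_calloc(std::size_t nmemb, std::size_t size)
    {
        if (nmemb && size > SIZE_MAX / nmemb) {
            return nullptr;              // nmemb * size would overflow
        }
        void* p = std::malloc(nmemb * size);
        if (!p) {
            return nullptr;              // the fix: never memset a null pointer
        }
        return std::memset(p, 0, nmemb * size);
    }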
-
Pekka Enberg authored
-
- Aug 05, 2013
-
-
Nadav Har'El authored
This patch fixes the bug in tst-resolve.so, where an OSv symbol (such as debug()) hides a symbol in the application (a shared object we are running). We look up symbols in load order - if tst-resolve.so needs libstdc++.so, we search for symbols in this order: first in tst-resolve.so and then in libstdc++.so. This order is mostly fine. There is one problem, though: OSv itself is loaded first, so it always gets searched first, which is not really what users expect. Users expect OSv to behave like the glibc library (searched last), not like the main executable (searched first). So this patch keeps the first-loaded object (OSv itself) last in the search order.
-
Avi Kivity authored
Increasing the poll time increases the chances that we can avoid the IPI. Idle host load increase is negligible.
-
Avi Kivity authored
The scheduler polls the wakeup queue when idle for a short time before HLTing, in order to avoid the expensive HLT instruction if a wakeup arrives early. This patch extends this to also disable remote wakeups during the polling period, reducing the waking cpu's need to issue an IPI, which requires an exit. This helps synchronous multithreaded workloads, where threads block and wake each other. Together with the following patch, netperf throughput increases from ~17Gbps to ~19Gbps, and the context switch benchmark improves from

    $ run tests/tst-ctxsw.so
         345 colocated
        5761 apart
         633 nopin

to

    $ run tests/tst-ctxsw.so
         347 colocated
         598 apart
         467 nopin
-
- Aug 04, 2013
-
-
Nadav Har'El authored
Convert poll.c to poll.cc, and add a few tracepoints.
-