- Dec 15, 2013
-
-
Nadav Har'El authored
thread::destroy() had a "FIXME" comment:

    // FIXME: we have a problem in case of a race between join() and the
    // thread's completion. Here we can see _joiner==0 and not notify
    // anyone, but at the same time join() decided to go to sleep (because
    // status is not yet status::terminated) and we'll never wake it.

This is indeed a bug, which Glauber discovered was hanging the tst-threadcomplete.so test once in a while: the test sometimes hangs with one thread in the "terminated" state (waiting for someone to join it), and a second thread waiting in join(), having missed the other thread's termination event.

The solution works like this:

join() uses a CAS to set itself as the _joiner. If it succeeds, it waits as before for the status to become "terminated". But if the CAS fails, it means a concurrent destroy() call beat us in the race, and we can just return from join().

destroy() checks (with a CAS) if _joiner was already set; if so, we need to wake this thread just like in the original code. But if _joiner was not yet set, either there is no one doing join(), or there's a concurrent join() call that will soon return (this is what the joiner does when it loses the CAS race). In this case, all we need to do is set the status to "terminated", and we must do it through a _detached_state we saved earlier, because if join() has already returned, the thread may already be deleted.
Signed-off-by:
Nadav Har'El <nyh@cloudius-systems.com> Signed-off-by:
Avi Kivity <avi@cloudius-systems.com>
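The join()/destroy() handshake described above can be sketched in portable C++ roughly as follows. This is a simplified model under stated assumptions, not the OSv code: toy_thread, detached_state, the slot values and the condition variable are all hypothetical stand-ins for the scheduler's own sleep/wake machinery.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <memory>
#include <mutex>

// Hypothetical stand-in for the state that may outlive the thread object.
struct detached_state {
    bool terminated = false;  // guarded by mtx
    std::mutex mtx;
    std::condition_variable cv;
};

// Hypothetical model of a thread with a single joiner slot.
struct toy_thread {
    enum slot : int { none = 0, joiner_waiting = 1, destroy_won = 2 };
    std::atomic<int> joiner{none};
    std::shared_ptr<detached_state> state = std::make_shared<detached_state>();

    // Returns true if join() registered itself and waited,
    // false if destroy() had already won the race.
    bool join() {
        int expected = none;
        if (!joiner.compare_exchange_strong(expected, joiner_waiting)) {
            return false;  // lost the CAS: nothing to wait for
        }
        auto st = state;
        std::unique_lock<std::mutex> lk(st->mtx);
        st->cv.wait(lk, [&] { return st->terminated; });
        return true;
    }

    void destroy() {
        auto st = state;  // saved before the CAS
        int expected = none;
        bool joiner_present =
            !joiner.compare_exchange_strong(expected, destroy_won);
        // From here on only `st` is used: if no joiner was registered, a
        // racing join() may return and the owner may delete *this.
        {
            std::lock_guard<std::mutex> lk(st->mtx);
            st->terminated = true;
        }
        if (joiner_present) {
            st->cv.notify_one();  // wake the joiner that went to sleep
        }
    }
};
```

Whichever side wins the CAS, exactly one of two things happens: the joiner sleeps and is woken, or it returns immediately without sleeping.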
-
Nadav Har'El authored
wake_with(action) was implemented using thread_handle, as follows:

    thread_handle h(handle());
    action();
    h.wake();

This implementation is wrong: it only takes the RCU lock (which prevents the destruction of _detached_state) during h.wake(), meaning that if the thread is not sleeping and action() causes it to exit, _detached_state may also be destroyed, and h.wake() will crash.

thread_handle is simply not needed for wake_with(); it was designed with a completely different use case in mind (long-term holding of a thread handle). We just need to take, in-line, the appropriate RCU lock which keeps _detached_state alive. The resulting code is even simpler, and nicely parallels the existing code of wake().

This patch fixes a real bug, but unfortunately we don't have a concrete test case which it is known to fix.
Signed-off-by:
Nadav Har'El <nyh@cloudius-systems.com> Signed-off-by:
Avi Kivity <avi@cloudius-systems.com>
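The shape of the fix, holding the lock across both action() and the wake, might look like the sketch below. The types are hypothetical: toy_rcu_lock only counts nesting so the example can be checked, and wake_locked() stands in for the real wake-up.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-in for the RCU read lock that keeps _detached_state alive.
struct toy_rcu_lock {
    int depth = 0;
    void lock() { ++depth; }
    void unlock() { --depth; }
};

struct toy_thread {
    toy_rcu_lock rcu;
    bool woken = false;

    // Assumes the caller already holds the lock.
    void wake_locked() { woken = true; }

    template <typename Action>
    void wake_with(Action action) {
        // The guard spans action() AND the wake, so the state cannot be
        // torn down in the window between the two.
        std::lock_guard<toy_rcu_lock> guard(rcu);
        action();
        wake_locked();
    }
};
```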
-
Nadav Har'El authored
Add a new lock, "rcu_read_lock_in_preempt_disabled", which is exactly like rcu_read_lock but assumes that preemption is already disabled. Because all our rcu_read_lock does is disable preemption, the new lock type currently does absolutely nothing, but in some future implementation of RCU it might need to do something. We'll use the new lock type in the following patch, as an optimization over the regular rcu_read_lock. Signed-off-by:
Nadav Har'El <nyh@cloudius-systems.com> Signed-off-by:
Avi Kivity <avi@cloudius-systems.com>
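As a rough illustration of why the second lock type can be empty, the sketch below models preemption as a plain counter. The names and the counter are assumptions made for the example; they are not the OSv definitions.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical preemption-disable counter (per-CPU in a real kernel).
static int preempt_counter = 0;

// Models a read lock whose only job is to disable preemption.
struct toy_rcu_read_lock {
    void lock() { ++preempt_counter; }
    void unlock() { --preempt_counter; }
};

// Same interface, for callers that already run with preemption disabled:
// there is nothing left to do, so both operations are empty.
struct toy_rcu_read_lock_in_preempt_disabled {
    void lock() {}
    void unlock() {}
};
```

Keeping a separate type, even an empty one, marks the call sites that rely on the assumption, so a different RCU implementation could later give them real work to do.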
-
Glauber Costa authored
Context: going to wait with IRQs disabled is a recipe for disaster. While it is true that not every call to wait() actually ends up waiting, such a call should still be considered invalid, because of the times it does wait. It would therefore be good to express that in an assertion. There are, however, places where we currently sleep with IRQs disabled. Although they are technically safe, because we implicitly enable interrupts, they end up reaching wait() in a non-safe state. That happens in the page fault handler. Explicitly enabling interrupts will allow us to test for valid / invalid wait status. With this test applied, all tests in our whitelist still pass. Signed-off-by:
Glauber Costa <glommer@cloudius-systems.com> Signed-off-by:
Avi Kivity <avi@cloudius-systems.com>
-
- Dec 13, 2013
-
-
Raphael S. Carvalho authored
Signed-off-by:
Raphael S. Carvalho <raphaelsc@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Raphael S. Carvalho authored
Signed-off-by:
Raphael S. Carvalho <raphaelsc@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Raphael S. Carvalho authored
Signed-off-by:
Raphael S. Carvalho <raphaelsc@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
- Dec 12, 2013
-
-
Avi Kivity authored
Fix a unicode character in the author's name. Signed-off-by:
Avi Kivity <avi@cloudius-systems.com>
-
Pekka Enberg authored
Make sure that the address range passed to munmap() is actually mapped. Reviewed-by:
Raphael S. Carvalho <raphaelsc@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
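One way such a check could be written is sketched below. It assumes that "actually mapped" means the whole range is covered by existing mappings, and it models the mappings as a hypothetical std::map from start address to exclusive end address; neither assumption comes from the patch itself.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

using toy_addr = std::uintptr_t;
// start -> end (exclusive); entries are assumed not to overlap.
using toy_vma_map = std::map<toy_addr, toy_addr>;

// Returns true if every byte of [start, end) lies inside some mapping.
bool is_range_mapped(const toy_vma_map& vmas, toy_addr start, toy_addr end) {
    if (start >= end) {
        return false;
    }
    auto it = vmas.upper_bound(start);  // first mapping starting after `start`
    if (it == vmas.begin()) {
        return false;                   // nothing starts at or before `start`
    }
    --it;
    toy_addr covered = start;           // everything below this is known mapped
    while (it != vmas.end() && it->first <= covered) {
        if (it->second > covered) {
            covered = it->second;
        }
        if (covered >= end) {
            return true;
        }
        ++it;
    }
    return false;                       // hit a hole before reaching `end`
}
```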
-
Pekka Enberg authored
Simplify mmap() by converting flags and permissions in one place. Reviewed-by:
Raphael S. Carvalho <raphaelsc@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Pekka Enberg authored
Move mincore() to libc/mman.cc where all other memory mapping libc functions are. Reviewed-by:
Raphael S. Carvalho <raphaelsc@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Pekka Enberg authored
Add a mmu::is_page_aligned() helper function and use it to get rid of open-coded checks. Reviewed-by:
Raphael S. Carvalho <raphaelsc@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
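A helper of this kind is usually a one-line mask test. The sketch below assumes 4 KiB pages and uses a hypothetical sketch namespace rather than the real mmu one.

```cpp
#include <cassert>
#include <cstdint>

namespace sketch {

constexpr std::uintptr_t page_size = 4096;  // assumption: 4 KiB pages

// True when addr is a multiple of the page size (a power of two).
constexpr bool is_page_aligned(std::uintptr_t addr) {
    return (addr & (page_size - 1)) == 0;
}

}  // namespace sketch
```

A named helper replaces scattered open-coded mask expressions, so every call site states the same intent.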
-
Vlad Zolotarov authored
  - It's compiled out when mode=release.
  - Uses an assert() for issuing the assert.
  - Has printf-like semantics.
Signed-off-by:
Vlad Zolotarov <vladz@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
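A macro with the three properties listed above could be sketched as follows. SKETCH_DEBUG_ASSERT and SKETCH_RELEASE are hypothetical names chosen for this example; the patch's actual macro and build switch are not shown in the log.

```cpp
#include <cassert>
#include <cstdio>

// SKETCH_RELEASE is a hypothetical switch standing in for "mode=release".
#ifdef SKETCH_RELEASE
// Compiled out: neither the condition nor the arguments are evaluated.
#define SKETCH_DEBUG_ASSERT(cond, ...) do { } while (0)
#else
// Print a printf-style message, then let assert() report and abort.
#define SKETCH_DEBUG_ASSERT(cond, ...)             \
    do {                                           \
        if (!(cond)) {                             \
            std::fprintf(stderr, __VA_ARGS__);     \
            std::fputc('\n', stderr);              \
            assert(cond);                          \
        }                                          \
    } while (0)
#endif
```

The message arguments are only evaluated when the condition fails, so they cost nothing on the passing path.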
-
Vlad Zolotarov authored
  - Add -DNDEBUG to the compiler flags when mode!=debug.
  - Prevent assert() from compiling out in the kernel when mode=release.
Signed-off-by:
Vlad Zolotarov <vladz@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Vlad authored
Fix some compilation errors that would arise when NDEBUG is defined:

  - tests/misc-tcp.cc: missing #include <iostream>, which would come from include/boost/assert.hpp: line 81 when NDEBUG is not defined.
  - bsd/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_dataset.c: when NDEBUG is not defined, assert() is defined to __assert_fail, which has the "noreturn" attribute, which satisfies the compiler. When NDEBUG is defined and assert() completely compiles out, the compiler starts complaining about the missing "return" statement.
Signed-off-by:
Vlad Zolotarov <vladz@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
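The second problem can be reproduced in a few lines. The function below is a made-up example, and ending with std::abort() is only one possible way to keep every path terminated; the log does not say how libzfs_dataset.c was actually changed.

```cpp
#include <cassert>
#include <cstdlib>

enum class kind { file, dir };

int to_code(kind k) {
    switch (k) {
    case kind::file:
        return 1;
    case kind::dir:
        return 2;
    }
    // Without NDEBUG this expands to a call that never returns, so the
    // compiler accepts the function ending here.
    assert(!"unreachable");
    // With NDEBUG the assert() vanishes; this line keeps the final path
    // from falling off the end of a non-void function.
    std::abort();
}
```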
-
Vlad authored
Add the missing "#ifndef X / #define X" include-guard protection to include/api/assert.h. Signed-off-by:
Vlad Zolotarov <vladz@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Tomasz Grabiec authored
This patch adds sorting as an option (-s). Examples:

Not sorted:

    gdb$ osv trace duration

Sorted, narrowed down to one function:

    gdb$ osv trace duration -s vfs_pwritev

Not sorting allows us to start printing traces right away. There's also no need to keep them in memory, which makes the command a bit faster.
Signed-off-by:
Tomasz Grabiec <tgrabiec@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Pekka Enberg authored
Make blacklisted tests visible during the test.py run to attract attention to them and hopefully get them fixed:

    TEST tst-strerror_r.so OK (1.083 s)
    TEST tst-threadcomplete.so SKIPPED
    TEST tst-tracepoint.so OK (1.120 s)

Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Pekka Enberg authored
The test hangs at times and, while there's a fix brewing, it's making 'make check' less useful. Let's add it back when it works all the time. Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Juan Antonio Osorio authored
Previously, the variables image_file, cmd_args and opt_path were used globally throughout the run.py script. I did not consider this convenient, and thus opted to convert these global variables into one variable that is passed as a parameter. This variable was renamed to options, with the mindset that, if desired, it could come from a configuration file, thus making the passing of command line arguments optional. So, if support for a configuration file (or perhaps default values) is added to the script, there would not be as much need for refactoring (or renaming) as there would be with these global variables. Reviewed-by:
Tomasz Grabiec <tgrabiec@gmail.com> Signed-off-by:
Juan Antonio Osorio Robles <jaosorior@gmail.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Tomasz Grabiec authored
The '-nographic' option is deprecated and no longer imposes the behavior which we want since this commit: qemu.git 02c4bdf1d2ca8c02a9bae16398f260b5c08d08bf. We should use the 'signal' option of the chardev instead, which works with current qemu master as well as on 1.4.0. This patch renames the '-g' option to '-s', as the former name is no longer adequate. Reported-by:
Juan Antonio Osorio Robles <jaosorior@gmail.com> Signed-off-by:
Tomasz Grabiec <tgrabiec@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Nadav Har'El authored
After commit dc40b49e, the _cpu and _status fields of a thread have moved (into a structure pointed to by the _detached_thread field). So fix loader.py to use these new locations, so "osv info threads" will work again. Signed-off-by:
Nadav Har'El <nyh@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
- Dec 11, 2013
-
-
Pekka Enberg authored
There's no reason to run a locale test as part of OSv boot sequence... Avi writes: It used to be a major stumbling block but is of no interest now. Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Pekka Enberg authored
Simplify core/mmu.cc and make it more portable by moving the page fault handler to arch/x64/mmu.cc. There's more arch-specific code in core/mmu.cc that should also be moved. Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Pekka Enberg authored
Make vma constructors more strongly typed by using the addr_range type. Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Pekka Enberg authored
Separate the common vma code to an abstract base class that's inherited by anon_vma and file_vma. Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
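The two vma refactorings above (a range type for the constructors and a shared abstract base) might be sketched like this. All names carry a toy_ prefix because the members shown, such as kind() and the path argument, are illustrative assumptions rather than the real class interface.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>

// A dedicated range type: callers cannot mix up (start, end) with
// (start, size) the way two bare integers allow.
struct toy_addr_range {
    std::uintptr_t start;
    std::uintptr_t end;  // exclusive
};

// Common code lives in the abstract base.
class toy_vma {
public:
    explicit toy_vma(toy_addr_range range) : _range(range) {}
    virtual ~toy_vma() = default;
    std::uintptr_t start() const { return _range.start; }
    std::uintptr_t end() const { return _range.end; }
    std::size_t size() const { return _range.end - _range.start; }
    virtual std::string kind() const = 0;  // differs per mapping type
private:
    toy_addr_range _range;
};

class toy_anon_vma : public toy_vma {
public:
    using toy_vma::toy_vma;
    std::string kind() const override { return "anon"; }
};

class toy_file_vma : public toy_vma {
public:
    toy_file_vma(toy_addr_range range, std::string path)
        : toy_vma(range), _path(std::move(path)) {}
    std::string kind() const override { return "file:" + _path; }
private:
    std::string _path;
};
```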
-
Nadav Har'El authored
Add images/memcached.py, so that "make image=memcached" would work. To use this, you'll also need to check out a recent version of the osv-apps repository (in apps directory, check out the master branch, and git pull). Signed-off-by:
Nadav Har'El <nyh@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Amnon Heiman authored
Separate /dev/random from the virtio-rng driver and register virtio-rng as a HW RNG entropy source. Signed-off-by:
Amnon Heiman <amnon@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Glauber Costa authored
We have recently seen a problem where an occasional page fault outside the application would occur. I managed to track that down to my huge page failure patch, but wasn't really sure what was going on. Kudos to Raphael, then, who figured out that the problem happened when allocate_intermediate_level was called from split_huge_page. The problem here is that in that case we do *not* enter allocate_intermediate_level with the pte emptied, and were previously expecting the write of the new pte to happen unconditionally. The compare_exchange broke it, because the exchange doesn't really happen. There are many ways to fix this issue, but the least confusing of them, given that there are other callers to this function that could potentially display this problem, is to do some defensive programming and clearly separate the semantics of both types of callers. Signed-off-by:
Glauber Costa <glommer@cloudius-systems.com> Tested-by:
Raphael S. Carvalho <raphaelsc@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Nadav Har'El authored
Once page_fault() checks that this is not a fast fixup (see safe_load()), we reach the page-fault slow path, which needs to allocate memory or even read from disk, and might sleep. If we ever get such a slow page-fault inside kernel code which has preemption or interrupts disabled, this is a serious bug, because the code in question thinks it cannot sleep. So this patch adds two assertions to verify this. The preemptable() assertion is easily triggered if stacks are demand-paged as explained in commit 41efdc1c (I have a patch to solve this, but it won't fit in the margin). However, I've also seen this assertion without demand-paged stacks, when running all tests together through testrunner.so. So I'm hoping these assertions will be helpful in hunting down some elusive bugs we still have. This patch adds a third use of the "0x200" constant (bit 9 of the rflags register is the interrupt flag), so it replaces them with a new symbolic name, processor::rflags_if. Signed-off-by:
Nadav Har'El <nyh@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
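The constant in question is simply bit 9 of RFLAGS. A minimal sketch, using a hypothetical sketch namespace and a helper that is not part of the patch:

```cpp
#include <cassert>
#include <cstdint>

namespace sketch {

// Interrupt-enable flag: bit 9 of the x86-64 RFLAGS register (0x200).
constexpr std::uint64_t rflags_if = std::uint64_t{1} << 9;

// Hypothetical helper: were interrupts enabled in the saved flags?
constexpr bool interrupts_enabled(std::uint64_t rflags) {
    return (rflags & rflags_if) != 0;
}

}  // namespace sketch
```

An assertion in the slow path can then test the saved flags by name instead of repeating the magic number.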
-
Glauber Costa authored
We suddenly stop propagating the exception frame down the vma_fault path. There is no reason not to propagate it further, aside from the fact that currently there are no users. However, aside from the fact that it presents a more consistent frame passing, I intend to use it for the JVM balloon. Signed-off-by:
Glauber Costa <glommer@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Raphael S. Carvalho authored
Signed-off-by:
Raphael S. Carvalho <raphaelsc@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Nadav Har'El authored
Rename blacklisted tests, from tst-wake.cc et al. to misc-wake.cc. The different name will cause these tests not to be automatically run by "make check", without needing the separate blacklist in test.py (which this patch deletes). After this patch, testrunner.so will also only run tests called tst-*, so it will not run the misc-* tests. The misc-* tests can still be run manually, e.g., run.py -e tests/misc-mutex.so. In addition to the previously blacklisted tests, this patch "blacklists" (renames) a few additional tests which fail quickly, but test.py didn't know because they didn't use the word "fail". An example is tst-schedule.so, which exited immediately when not run on 1 vcpu. So this patch also renames it to misc-schedule.so, so "make check" or testrunner.so won't run this test. Note that after this patch, testrunner.so is a new way to run all tests, but it isn't working well yet because it still exposes new bugs that do not exist in the separate tests (depending on your viewpoint, this might be considered a feature, not a bug, in testrunner.so...). Signed-off-by:
Nadav Har'El <nyh@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Asias He authored
This reduces unnecessary interrupts that the host could send to the guest while the guest is in the process of irq handling. In virtio_driver::wait_for_queue, we will re-enable interrupts when there is nothing to process. Signed-off-by:
Asias He <asias@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Pekka Enberg authored
Using hard-coded path names is problematic because other test cases may use the same path names and forget to clean up after themselves. Make tst-fs-link.so more robust by using mktemp() to generate unique path names. Reviewed-by:
Tomasz Grabiec <tgrabiec@gmail.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Glauber Costa authored
The last part of the standard thread tests creates 4 threads and calls the detach of one from the body of another. They live in the same block to guarantee that they will all be destroyed at more or less the same time (we expect). Avi, however, demonstrated that a mistake prevents that from being the actual case:

    t1 starts to run
    t2 starts to run
    t3 starts to run
    t4 starts to run
    t4 is detached
    t4 is destroyed (ok)
    t3 is destroyed: it wasn't detached or joined, so terminate
    t1, t2, t3 are detached, but too late

This introduces a simple wait mechanism to avoid having the threads terminated after the block is gone.
Signed-off-by:
Glauber Costa <glommer@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
- Dec 10, 2013
-
-
Nadav Har'El authored
This patch fixes two bugs in shared-object finalization, i.e., running its static destructors before it is unloaded. The bugs were seen when osv::run()ing a test program using libboost_unit_test_framework-mt.so, which crashed after the test program finished. The two related bugs were:

  1. We need to call the module's destructors (run_fini_funcs()) *before* removing it from the module list, otherwise the destructors will not be able to call functions from this module! (We got a symbol not found error in the destructor.)
  2. We need to unload the modules needed by this module *before* unloading this module, not after, as was (implicitly) done until now. This makes sense because of symmetry (during a module load, the needed modules are loaded after this one), but also practically: a needed module's destructor (in our case, boost unit test framework) might refer to objects in the needing module (in our case, the test program), so we cannot call the needed module's destructor after we've already unloaded the needing module.
Signed-off-by:
Nadav Har'El <nyh@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
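The ordering required by the two fixes can be modelled with a small sketch that just records events. toy_module and toy_loader are hypothetical, and the model ignores modules shared between several users, which a real loader has to handle.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical module record: a name plus the modules it needs.
struct toy_module {
    std::string name;
    std::vector<toy_module*> needed;
};

struct toy_loader {
    std::vector<toy_module*> modules;  // the "module list"
    std::vector<std::string> log;      // records the order of events

    void unload(toy_module* m) {
        // Fix 2: the modules this one needs are unloaded first.
        for (toy_module* dep : m->needed) {
            unload(dep);
        }
        // Fix 1: destructors run while the module is still in the list,
        // so symbol lookups made from inside them can still succeed.
        log.push_back("fini " + m->name);
        modules.erase(std::remove(modules.begin(), modules.end(), m),
                      modules.end());
        log.push_back("remove " + m->name);
    }
};
```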
-
Juan Antonio Osorio authored
Signed-off-by:
Juan Antonio Osorio Robles <jaosorior@gmail.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Pekka Enberg authored
Add a '--repeat' option to test.py that repeats the test suite until a test fails. This is useful for detecting test cases that fail some of the time. Reviewed-by:
Tomasz Grabiec <tgrabiec@gmail.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Pekka Enberg authored
Make the test runner output look pretty and show test duration to make it visible which tests take the longest time to run. The output looks as follows now:

    TEST tst-af-local.so OK (3.288 s)
    TEST tst-bdev-write.so OK (1.058 s)
    TEST tst-bsd-evh.so OK (1.071 s)
    TEST tst-bsd-kthread.so OK (1.234 s)
    TEST tst-bsd-taskqueue.so OK (1.062 s)
    TEST tst-bsd-tcp1.so OK (2.114 s)
    TEST tst-commands.so OK (1.141 s)
    TEST tst-condvar.so OK (1.776 s)
    TEST tst-dns-resolver.so OK (2.560 s)
    TEST tst-epoll.so OK (1.952 s)
    TEST tst-except.so OK (1.146 s)
    TEST tst-fpu.so OK (2.630 s)
    TEST tst-fs-link.so OK (1.051 s)
    TEST tst-fs-stress.so OK (1.027 s)
    TEST tst-fsx.so OK (1.067 s)
    TEST tst-hub.so OK (6.256 s)
    TEST tst-huge.so OK (2.199 s)
    TEST tst-kill.so OK (4.147 s)
    TEST tst-libc-locking.so OK (2.110 s)
    TEST tst-loadbalance.so OK (1.070 s)
    TEST tst-mmap-file.so OK (1.080 s)
    TEST tst-mmap.so OK (1.087 s)
    TEST tst-pipe.so OK (7.306 s)
    TEST tst-preempt.so OK (1.119 s)
    TEST tst-pthread.so OK (1.100 s)
    TEST tst-queue-mpsc.so OK (3.748 s)
    TEST tst-ramdisk.so OK (1.078 s)
    TEST tst-readdir.so OK (1.094 s)
    TEST tst-remove.so OK (1.030 s)
    TEST tst-rename.so OK (1.157 s)
    TEST tst-resolve.so OK (1.095 s)
    TEST tst-scheduler.so OK (1.087 s)
    TEST tst-sleep.so OK (3.083 s)
    TEST tst-solaris-taskq.so OK (1.061 s)
    TEST tst-stat.so OK (1.106 s)
    TEST tst-strerror_r.so OK (1.102 s)
    TEST tst-tcp-sendonly.so OK (2.014 s)
    TEST tst-tcp.so OK (1.080 s)
    TEST tst-threadcomplete.so OK (2.770 s)
    TEST tst-tracepoint.so OK (1.109 s)
    TEST tst-truncate.so OK (1.083 s)
    TEST tst-utimes.so OK (1.079 s)
    TEST tst-vblk.so OK (1.310 s)
    TEST tst-vfs.so OK (1.118 s)
    TEST tst-yield.so OK (1.992 s)
    TEST tst-zfs-mount.so OK (1.087 s)
    OK (58 tests run, 82.944 s)

Reviewed-by:
Nadav Har'El <nyh@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-