- Dec 30, 2013
Gleb Natapov authored
mprotect(PROT_WRITE) on a file opened as read-only should fail, but the current mprotect() implementation is missing the check. The patch implements it. Signed-off-by:
Gleb Natapov <gleb@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
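A minimal sketch of the kind of check described above, with invented structure and field names rather than OSv's actual mmu code; the EACCES return value is an assumption based on how Linux handles this case.

    #include <cerrno>

    // Hypothetical, simplified view of a file-backed mapping; the real OSv
    // vma/file structures differ.
    struct mapping_sketch {
        bool file_backed;
        bool file_opened_for_write;   // was the underlying file opened writable?
        bool shared;                  // MAP_SHARED rather than MAP_PRIVATE
    };

    // The missing check: asking for PROT_WRITE on a shared mapping of a file
    // opened read-only is rejected.
    int mprotect_write_check(const mapping_sketch& m, bool want_write)
    {
        if (want_write && m.file_backed && m.shared && !m.file_opened_for_write) {
            return EACCES;
        }
        return 0;   // permission change allowed
    }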
-
- Dec 26, 2013
Gleb Natapov authored
Add constexpr to make sure they are evaluated at compile time if possible. The compiler will probably do it anyway, though. Signed-off-by:
Gleb Natapov <gleb@cloudius-systems.com> Signed-off-by:
Avi Kivity <avi@cloudius-systems.com>
-
- Dec 24, 2013
Avi Kivity authored
Helps with making changes to the bsd headers that xen includes. Signed-off-by:
Avi Kivity <avi@cloudius-systems.com>
-
Nadav Har'El authored
We use sched::thread::attr to pass parameters to sched::thread creation, e.g., to create a thread with non-default stack parameters, pinned to a particular CPU, or detached. Previously we had constructors taking many combinations of stack size (integer), pinned cpu (cpu*) and detached (boolean), and doing "the right thing". However, this makes the code hard to read (what does attr(4096) specify?) and the constructors hard to extend with new parameters. Replace the attr() constructors with the so-called "named parameter" idiom: attr now has only a default constructor attr(), and one modifies it with calls to pin(cpu*), detach(), or stack(size). For example:
    attr()                                  // default attributes
    attr().pin(sched::cpus[0])              // pin to cpu 0
    attr().stack(4096).pin(sched::cpus[0])  // pin and non-default stack
and so on. Signed-off-by:
Nadav Har'El <nyh@cloudius-systems.com> Signed-off-by:
Avi Kivity <avi@cloudius-systems.com>
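Below is a self-contained sketch of the named-parameter idiom described in the commit; the class is illustrative (the default stack size and member names are assumptions), not OSv's actual sched::thread::attr.

    #include <cstddef>

    class cpu;   // opaque for the purposes of the sketch

    class attr {
    public:
        attr() = default;                                 // default attributes
        attr& pin(cpu* c)         { _pinned = c;      return *this; }
        attr& detach()            { _detached = true; return *this; }
        attr& stack(size_t bytes) { _stack = bytes;   return *this; }
    private:
        cpu*   _pinned   = nullptr;
        bool   _detached = false;
        size_t _stack    = 65536;                         // arbitrary default for the sketch
    };

    int main()
    {
        attr a = attr().stack(4096).detach();   // modifiers chain and read like named parameters
        (void)a;
        return 0;
    }

Each modifier returns *this, which is what lets the calls chain while staying readable and easy to extend with new parameters.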
-
- Dec 20, 2013
Nadav Har'El authored
Our sched::thread makes it rather difficult to create threads with non-default attributes. This patch makes it easier to create a thread with a non-default stack size, e.g., a light thread with a one-page stack:
    sched::thread a([&] { func(); }, sched::thread::attr(4096));
We should probably overhaul the sched::thread constructors at some point to make it easier to specify options, but for now, this specific constructor is convenient for my uses. Signed-off-by:
Nadav Har'El <nyh@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
- Dec 19, 2013
Avi Kivity authored
There is no need for release memory ordering when assigning a null pointer to an rcu pointer, since the null pointer cannot be dereferenced. Add a specialization of assign() that takes advantage of this fact. Reviewed-by:
Nadav Har'El <nyh@cloudius-systems.com> Signed-off-by:
Avi Kivity <avi@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
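A sketch of the optimization, using a plain std::atomic pointer as a stand-in for OSv's rcu pointer; the class and method names are illustrative only.

    #include <atomic>
    #include <cstddef>

    template <typename T>
    class rcu_ptr_sketch {
    public:
        // Publishing a real object needs release ordering so that readers who
        // observe the pointer also observe the object's initialized contents.
        void assign(T* p) { _ptr.store(p, std::memory_order_release); }

        // Overload for nullptr: nothing is being published, and a null pointer
        // can never be dereferenced, so a relaxed store is sufficient.
        void assign(std::nullptr_t) { _ptr.store(nullptr, std::memory_order_relaxed); }

        T* read() const { return _ptr.load(std::memory_order_acquire); }

    private:
        std::atomic<T*> _ptr{nullptr};
    };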
-
Gleb Natapov authored
mprotect() should fail with ENOMEM if it is called on a non-mapped virtual address; this check is done using mmu::ismapped(). Signed-off-by:
Gleb Natapov <gleb@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
- Dec 18, 2013
Glauber Costa authored
This patch adds the basics of memory tracking and exposes an interface for that data to be collected. We basically start with all stats at zero, and as we add memory to the system, we bump them up and recalculate the watermarks (to avoid recomputing them all the time). When a page range comes up, it is added as free memory. We operate based on what is currently sitting in the page ranges. This means that we are effectively ignoring memory that sits in pools for memory-usage purposes. I think this is a good assumption because it allows us to focus on the big picture and leave the pools to be used as liquid currency. Signed-off-by:
Glauber Costa <glommer@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
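A rough sketch of the bookkeeping described above; the names, thresholds and watermark formula are invented for illustration and are not OSv's actual accounting code.

    #include <atomic>
    #include <cstddef>

    namespace memstats_sketch {

    std::atomic<size_t> total_memory{0};
    std::atomic<size_t> free_memory{0};
    size_t low_watermark = 0;
    size_t high_watermark = 0;

    // Called when a page range is handed to the system: it starts out free,
    // and the watermarks are recalculated here instead of on every query.
    void add_page_range(size_t bytes)
    {
        total_memory += bytes;
        free_memory  += bytes;
        low_watermark  = total_memory / 50;   // arbitrary 2% threshold for the sketch
        high_watermark = total_memory / 10;   // arbitrary 10% threshold for the sketch
    }

    // Pool-resident memory is deliberately not tracked: only what moves in and
    // out of the page ranges changes these counters.
    void page_range_alloc(size_t bytes) { free_memory -= bytes; }
    void page_range_free(size_t bytes)  { free_memory += bytes; }

    }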
-
Asias He authored
A bio with the BIO_SCSI flag contains a SCSI command. Signed-off-by:
Asias He <asias@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Asias He authored
This can be used by the low-level driver to store private data; e.g., the virtio-scsi driver uses it to store a virtio-scsi request bound to this bio. Signed-off-by:
Asias He <asias@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
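A sketch of how such a field might be used; the struct layouts and names below are assumptions, not the actual struct bio or virtio-scsi request types in OSv.

    // Simplified stand-ins for the real structures.
    struct scsi_request_sketch {
        int tag;                 // driver-specific bookkeeping
    };

    struct bio_sketch {
        int   flags;             // would carry a BIO_SCSI-style marker
        void* driver_private;    // the new low-level-driver private pointer
    };

    // The low-level driver binds its own request object to the bio so it can
    // find it again on completion.
    void submit(bio_sketch& b, scsi_request_sketch& req)
    {
        b.driver_private = &req;
    }

    void complete(bio_sketch& b)
    {
        auto* req = static_cast<scsi_request_sketch*>(b.driver_private);
        (void)req;   // finish the driver's request here
    }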
-
- Dec 17, 2013
Avi Kivity authored
With net channels, poll() needs to wait not only on poll wakeups and the timeout, but also on requests from network interfaces to flush net channels for polled sockets. In preparation for that, switch from bsd msleep() to native wait_until(). Signed-off-by:
Avi Kivity <avi@cloudius-systems.com>
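A sketch of why a predicate-based wait helps here, written with standard C++ primitives rather than OSv's sched::thread::wait_until; all names are invented.

    #include <atomic>
    #include <chrono>
    #include <condition_variable>
    #include <mutex>

    std::mutex poll_mutex;
    std::condition_variable poll_cv;
    std::atomic<bool> poll_wakeup{false};
    std::atomic<bool> flush_requested{false};   // a net channel flush request

    // With a predicate-based wait, poll() can wake on either condition (or the
    // timeout); an msleep()-style wait can only express the first kind of wakeup.
    bool poll_wait(std::chrono::milliseconds timeout)
    {
        std::unique_lock<std::mutex> lk(poll_mutex);
        return poll_cv.wait_for(lk, timeout, [] {
            return poll_wakeup.load() || flush_requested.load();
        });
    }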
-
Raphael S. Carvalho authored
Reviewed-by:
Dor Laor <dor@cloudius-systems.com> Signed-off-by:
Raphael S. Carvalho <raphaelsc@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
- Dec 16, 2013
Tomasz Grabiec authored
This code seems obviously broken to me: in tls_data().size, there is no tls_data function; it is a struct. So this creates a temporary, uninitialized struct and reads the size field from it. What was probably meant instead is the TLS size, which is calculated by the tls() function and returned in a tls_data structure. I am not able to actually test this change because I don't have any DSO which has R_X86_64_TPOFF64 relocations. Any idea how to test it? tls() is also broken, because it initializes the file_size field instead of the size field. The file_size field was added at some point but this place wasn't updated. As it appears that tls() is not actually used anywhere, this patch gets rid of it. Signed-off-by:
Tomasz Grabiec <tgrabiec@cloudius-systems.com> Signed-off-by:
Avi Kivity <avi@cloudius-systems.com>
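The bug pattern, reduced to a compilable example; the struct's contents are guessed from the commit text.

    #include <cstddef>

    // Guessed, simplified shape of the structure mentioned in the commit.
    struct tls_data {
        void*  start;
        size_t size;
        size_t file_size;
    };

    size_t broken_size()
    {
        // This compiles, but tls_data() is just a brand-new temporary, not a
        // call to any function that fills in the real TLS parameters, so the
        // value read here is meaningless for relocation purposes.
        return tls_data().size;
    }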
-
Tomasz Grabiec authored
Dynamically loaded modules use __tls_get_addr() to locate thread-local symbols. A symbol is identified by a module index and an offset into the module's TLS area. The module index and offset are filled in by the dynamic linker when the DSO is loaded. The TLS area for a given DSO is allocated dynamically, on demand. OSv keeps TLS areas in a vector indexed by module index, inside a per-thread vector. The TLS area of the core module should be handled differently from that of dynamically loaded modules. The TLS offsets for thread-local symbols defined in the core module are known at link time, and code inside the core module can use these offsets directly. The offsets are relative to the TCB pointer (the fs register on x86). The problem was that __tls_get_addr() was treating the core module as a dynamically loaded module and returned a pointer inside a dynamically allocated TLS area instead of a pointer inside the core module's TLS. As a result, code inside the core module was reading the value from a different location than the one code inside the DSO had written the value to. The offending thread-local variable was __once_call. It was set by call_once() defined in the DSO (inlined from a definition inside a header) and read by __once_proxy() defined in the core module. Fixes #125. Signed-off-by:
Tomasz Grabiec <tgrabiec@cloudius-systems.com> Signed-off-by:
Avi Kivity <avi@cloudius-systems.com>
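A very rough sketch of the distinction the commit draws; the module indexing, TLS layout, sizes and names below are simplified assumptions, not OSv's actual __tls_get_addr.

    #include <cstddef>
    #include <vector>

    constexpr size_t core_module_index = 0;   // assumed index of the core module

    struct tls_index_sketch {
        size_t module;   // filled in by the dynamic linker
        size_t offset;   // offset within that module's TLS area
    };

    // Per-thread state, greatly simplified.
    thread_local char core_tls[256];                         // stands in for the TCB-relative core TLS block
    thread_local std::vector<std::vector<char>> dso_tls(8);  // on-demand areas for dynamically loaded modules

    void* tls_get_addr_sketch(const tls_index_sketch& ti)
    {
        if (ti.module == core_module_index) {
            // Core module: its offsets are fixed at link time relative to the
            // thread's own TLS block, so answer with a pointer into that block
            // rather than into a dynamically allocated area.
            return core_tls + ti.offset;
        }
        auto& area = dso_tls[ti.module];
        if (area.empty()) {
            area.resize(256);   // allocate the DSO's TLS area on demand
        }
        return area.data() + ti.offset;
    }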
-
Tomasz Grabiec authored
In order to have uniform naming, ulong is used in several places. Signed-off-by:
Tomasz Grabiec <tgrabiec@cloudius-systems.com> Signed-off-by:
Avi Kivity <avi@cloudius-systems.com>
-
Avi Kivity authored
Signed-off-by:
Avi Kivity <avi@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Avi Kivity authored
Signed-off-by:
Avi Kivity <avi@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Avi Kivity authored
The bsd-specific ones are renumbered to avoid clashes. Signed-off-by:
Avi Kivity <avi@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Avi Kivity authored
Signed-off-by:
Avi Kivity <avi@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Avi Kivity authored
Since most SOL_* macros are equivalent to the IPPROTO_* defines, only one define needs to be changed. Signed-off-by:
Avi Kivity <avi@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Avi Kivity authored
Move them to a common include file. Since they're defined externally, there is no real conflict. Signed-off-by:
Avi Kivity <avi@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
- Dec 15, 2013
Nadav Har'El authored
thread::destroy() had a "FIXME" comment:
    // FIXME: we have a problem in case of a race between join() and the
    // thread's completion. Here we can see _joiner==0 and not notify
    // anyone, but at the same time join() decided to go to sleep (because
    // status is not yet status::terminated) and we'll never wake it.
This is indeed a bug, which Glauber discovered was hanging the tst-threadcomplete.so test once in a while - the test sometimes hangs with one thread in the "terminated" state (waiting for someone to join it), and a second thread waiting in join() that missed the other thread's termination event. The solution works like this: join() uses a CAS to set itself as the _joiner. If it succeeded, it waits like before for the status to become "terminated". But if the CAS failed, it means a concurrent destroy() call beat us in the race, and we can just return from join(). destroy() checks (with a CAS) whether _joiner was already set - if so, we need to wake that thread just like in the original code. But if _joiner was not yet set, either there is no one doing join(), or there is a concurrent join() call that will soon return (this is what the joiner does when it loses the CAS race). In this case, all we need to do is set the status to "terminated" - and we must do it through a _detached_state we saved earlier, because if join() already returned, the thread may already have been deleted. Signed-off-by:
Nadav Har'El <nyh@cloudius-systems.com> Signed-off-by:
Avi Kivity <avi@cloudius-systems.com>
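A compact, self-contained sketch of the CAS protocol described above; it uses a spin-wait and invented names instead of OSv's real scheduler primitives, so it only illustrates the ordering of operations.

    #include <atomic>
    #include <thread>

    struct thread_sketch {
        std::atomic<void*> _joiner{nullptr};
        std::atomic<bool>  _terminated{false};   // stands in for the detached_state status

        // Called by the thread that wants to join; 'self' identifies the caller.
        void join(void* self) {
            void* expected = nullptr;
            if (_joiner.compare_exchange_strong(expected, self)) {
                // We claimed the joiner slot: wait for destroy() to publish termination.
                while (!_terminated.load(std::memory_order_acquire)) {
                    std::this_thread::yield();   // real code parks and is woken instead
                }
            }
            // CAS failed: a concurrent destroy() beat us, so the thread is already
            // terminated (or about to be) and we can just return.
        }

        // Called on the terminating thread's path.
        void destroy() {
            void* expected = nullptr;
            // Try to claim the slot with a sentinel so a later join() returns at once.
            bool no_joiner_yet = _joiner.compare_exchange_strong(expected, this);
            _terminated.store(true, std::memory_order_release);
            if (!no_joiner_yet) {
                // A joiner already registered itself; real code would wake it here,
                // going through the saved _detached_state.
            }
        }
    };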
-
Nadav Har'El authored
wake_with(action) was implemented using thread_handle, as follows:
    thread_handle h(handle());
    action();
    h.wake();
This implementation is wrong: it only takes the RCU lock (which prevents the destruction of _detached_state) during h.wake(), meaning that if the thread is not sleeping, and action() causes it to exit, _detached_state may also be destructed, and h.wake() will crash. thread_handle is simply not needed for wake_with(), and was designed with a completely different use case in mind (long-term holding of a thread handle). We just need to use, inline, the appropriate rcu lock which keeps _detached_state alive. The resulting code is even simpler, and nicely parallels the existing code of wake(). This patch fixes a real bug, but unfortunately we don't have a concrete test case which it is known to fix. Signed-off-by:
Nadav Har'El <nyh@cloudius-systems.com> Signed-off-by:
Avi Kivity <avi@cloudius-systems.com>
-
Nadav Har'El authored
Add a new lock, "rcu_read_lock_in_preempt_disabled", which is exactly like rcu_read_lock but assuming that preemption is already disabled. Because all our rcu_read_lock does is to disable preemption, the new lock type currently does absolutely nothing - but in some future implementation of RCU it might need to do something. We'll use the new lock type in the following patch, as an optimization over the regular rcu_read_lock. Signed-off-by:
Nadav Har'El <nyh@cloudius-systems.com> Signed-off-by:
Avi Kivity <avi@cloudius-systems.com>
-
Glauber Costa authored
Context: going to wait with irqs disabled is a recipe for disaster. While it is true that not every call to wait() actually ends up waiting, it should still be considered an invalid call, because of the times when we do wait. Because of that, it would be good to express that nonsense in an assertion. There are, however, places where we currently sleep with irqs disabled. Although they are technically safe, because we implicitly enable interrupts, they end up reaching wait() in a non-safe state. That happens in the page fault handler. Explicitly enabling interrupts will allow us to test for valid / invalid wait status. With this test applied, all tests in our whitelist still pass. Signed-off-by:
Glauber Costa <glommer@cloudius-systems.com> Signed-off-by:
Avi Kivity <avi@cloudius-systems.com>
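A toy sketch of the invariant being introduced; the interrupt-flag helpers below are stand-ins, not OSv's arch code.

    #include <cassert>

    // Stand-in helpers for reading and changing the interrupt flag.
    static bool g_irq_enabled = false;
    static bool irq_enabled() { return g_irq_enabled; }
    static void irq_enable()  { g_irq_enabled = true; }

    void wait_sketch()
    {
        // The new invariant: never reach wait() with interrupts disabled.
        assert(irq_enabled() && "waiting with interrupts disabled is invalid");
        // ... actually block ...
    }

    void page_fault_handler_sketch()
    {
        // Interrupts used to be re-enabled only implicitly; doing it explicitly
        // up front is what makes the assertion above meaningful.
        irq_enable();
        wait_sketch();   // e.g. waiting for a page to be brought in
    }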
-
- Dec 13, 2013
Raphael S. Carvalho authored
Signed-off-by:
Raphael S. Carvalho <raphaelsc@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
- Dec 12, 2013
Pekka Enberg authored
Add a mmu::is_page_aligned() helper function and use it to get rid of open-coded checks. Reviewed-by:
Raphael S. Carvalho <raphaelsc@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
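A minimal version of such a helper, assuming 4 KiB pages and a uintptr_t argument; the real mmu::is_page_aligned() signature may differ.

    #include <cstdint>

    constexpr std::uintptr_t page_size = 4096;

    constexpr bool is_page_aligned(std::uintptr_t addr)
    {
        return (addr & (page_size - 1)) == 0;
    }

    static_assert(is_page_aligned(0x2000), "multiples of 4 KiB are aligned");
    static_assert(!is_page_aligned(0x2001), "other addresses are not");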
-
Vlad Zolotarov authored
- It's compiled out when mode=release.
- Uses assert() for issuing the assertion.
- Has printf-like semantics.
Signed-off-by:
Vlad Zolotarov <vladz@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
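A sketch of a macro with the three listed properties; the macro name and message formatting are assumptions, not the actual OSv macro (the ##__VA_ARGS__ form relies on a GCC/Clang extension).

    #include <cassert>
    #include <cstdio>

    // Compiled out entirely when NDEBUG is defined (release builds), takes
    // printf-like message arguments, and uses assert() underneath.
    #ifdef NDEBUG
    #define DBG_ASSERT(cond, fmt, ...) do { } while (0)
    #else
    #define DBG_ASSERT(cond, fmt, ...)                             \
        do {                                                       \
            if (!(cond)) {                                         \
                std::fprintf(stderr, fmt "\n", ##__VA_ARGS__);     \
                assert(cond);                                      \
            }                                                      \
        } while (0)
    #endif

    // Example use:
    //   DBG_ASSERT(len <= max, "length %zu exceeds limit %zu", len, max);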
-
Vlad Zolotarov authored
- Add -DNDEBUG to the compiler flags when mode!=debug.
- Prevent assert() from being compiled out in the kernel when mode=release.
Signed-off-by:
Vlad Zolotarov <vladz@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Vlad authored
Add the missing #ifndef X / #define X include-guard protection to include/api/assert.h. Signed-off-by:
Vlad Zolotarov <vladz@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
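The standard include-guard pattern being added; the guard macro name below is a placeholder, since the commit does not show the actual one.

    // include/api/assert.h (shape only)
    #ifndef OSV_API_ASSERT_H_GUARD
    #define OSV_API_ASSERT_H_GUARD

    /* ... the header's existing declarations ... */

    #endif /* OSV_API_ASSERT_H_GUARD */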
-
- Dec 11, 2013
Pekka Enberg authored
Simplify core/mmu.cc and make it more portable by moving the page fault handler to arch/x64/mmu.cc. There's more arch specific code in core/mmu.cc that should be also moved. Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Pekka Enberg authored
Make vma constructors more strongly typed by using the addr_range type. Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Pekka Enberg authored
Separate the common vma code to an abstract base class that's inherited by anon_vma and file_vma. Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Glauber Costa authored
We currently stop propagating the exception frame partway down the vma_fault path. There is no reason not to propagate it further, aside from the fact that there are currently no users. Besides making frame passing more consistent, I intend to use it for the JVM balloon. Signed-off-by:
Glauber Costa <glommer@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
- Dec 10, 2013
Nadav Har'El authored
This patch fixes two bugs in shared-object finalization, i.e., running its static destructors before it is unloaded. The bugs were seen when osv::run()ing a test program using libboost_unit_test_framework-mt.so, which crashed after the test program finished. The two related bugs were:
1. We need to call the module's destructors (run_fini_funcs()) *before* removing it from the module list, otherwise the destructors will not be able to call functions from this module! (We got a symbol-not-found error in the destructor.)
2. We need to unload the modules needed by this module *before* unloading this module, not after, as was (implicitly) done until now. This makes sense because of symmetry (during a module load, the needed modules are loaded after this one), but also practically: a needed module's destructor (in our case, the boost unit test framework) might refer to objects in the needing module (in our case, the test program), so we cannot call the needed module's destructor after we've already unloaded the needing module.
Signed-off-by:
Nadav Har'El <nyh@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Raphael S. Carvalho authored
Currently, namei() does vget() unconditionally if no dentry is found. This is wrong because the path can be a hard link that points to a vnode that's already in memory. To fix the problem:
- Use the inode number as part of the hash in vget().
- Use vn_lookup() in vget() to make sure we have one vnode in memory per inode number.
- Push the vget() calls down to individual filesystems and make VOP_LOOKUP return a vnode.
Changes since v2:
- v1 dropped the lock in vn_lookup; thus, assert that vnode_lock is held.
Changes since v3:
- Fix a lock-ordering issue in dentry_lookup. The lock for the parent node must be acquired before dentry_lookup and released after the process is done. Otherwise, a second thread looking up the same dentry may take the 'NULL' path incorrectly.
Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com> Signed-off-by:
Raphael S. Carvalho <raphaelsc@cloudius-systems.com> Signed-off-by:
Avi Kivity <avi@cloudius-systems.com>
-
Avi Kivity authored
As ref() is now never called, we can remove the reference counter and make unref() unconditional. Signed-off-by:
Avi Kivity <avi@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Avi Kivity authored
Signed-off-by:
Avi Kivity <avi@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Avi Kivity authored
One problem with wake() is that, if the thread it is waking can concurrently exit, it may touch freed memory belonging to the thread structure. Fix by separating the state that wake() touches into a detached_state structure, and freeing that using rcu. Add a thread_handle class that references only this detached state and accesses it via rcu. Signed-off-by:
Avi Kivity <avi@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Avi Kivity authored
Makes it much easier to use. Signed-off-by:
Avi Kivity <avi@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-