- Dec 16, 2013
-
-
Avi Kivity authored
Signed-off-by: Avi Kivity <avi@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
-
Avi Kivity authored
The BSD-specific ones are renumbered to avoid clashes.
Signed-off-by: Avi Kivity <avi@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
-
Avi Kivity authored
Signed-off-by: Avi Kivity <avi@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
-
Avi Kivity authored
Since most SOL_* macros are equivalent to the IPPROTO_* defines, only one define needs to be changed.
Signed-off-by: Avi Kivity <avi@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
-
Avi Kivity authored
Move them to a common include file. Since they're defined externally, there is no real conflict.
Signed-off-by: Avi Kivity <avi@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
-
- Dec 15, 2013
-
-
Nadav Har'El authored
thread::destroy() had a "FIXME" comment:

  // FIXME: we have a problem in case of a race between join() and the
  // thread's completion. Here we can see _joiner==0 and not notify
  // anyone, but at the same time join() decided to go to sleep (because
  // status is not yet status::terminated) and we'll never wake it.

This is indeed a bug, which Glauber discovered was hanging the tst-threadcomplete.so test once in a while - the test sometimes hangs with one thread in the "terminated" state (waiting for someone to join it), and a second thread waiting in join() that missed the other thread's termination event.

The solution works like this: join() uses a CAS to set itself as the _joiner. If it succeeded, it waits like before for the status to become "terminated". But if the CAS failed, it means a concurrent destroy() call beat us in the race, and we can just return from join().

destroy() checks (with a CAS) whether _joiner was already set - if so, we need to wake that thread just like in the original code. But if _joiner was not yet set, either no-one is doing join(), or there's a concurrent join() call that will soon return (this is what the joiner does when it loses the CAS race). In that case, all we need to do is set the status to "terminated" - and we must do it through a _detached_state we saved earlier, because if join() already returned, the thread may already be deleted.
Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
Signed-off-by: Avi Kivity <avi@cloudius-systems.com>
-
Nadav Har'El authored
wake_with(action) was implemented using thread_handle, as follows:

  thread_handle h(handle());
  action();
  h.wake();

This implementation is wrong: it only takes the RCU lock (which prevents the destruction of _detached_state) during h.wake(), meaning that if the thread is not sleeping, and action() causes it to exit, _detached_state may also be destructed, and h.wake() will crash.

thread_handle is simply not needed for wake_with(), and was designed with a completely different use case in mind (long-term holding of a thread handle). We just need to take, inline, the appropriate RCU lock which keeps _detached_state alive. The resulting code is even simpler, and nicely parallels the existing code of wake(). This patch fixes a real bug, but unfortunately we don't have a concrete test case which it is known to fix.
Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
Signed-off-by: Avi Kivity <avi@cloudius-systems.com>
-
Nadav Har'El authored
Add a new lock, "rcu_read_lock_in_preempt_disabled", which is exactly like rcu_read_lock but assumes that preemption is already disabled. Because all our rcu_read_lock does is disable preemption, the new lock type currently does absolutely nothing - but in some future implementation of RCU it might need to do something. We'll use the new lock type in the following patch, as an optimization over the regular rcu_read_lock.
Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
Signed-off-by: Avi Kivity <avi@cloudius-systems.com>
-
Glauber Costa authored
Context: going to wait with interrupts disabled is a recipe for disaster. While it is true that not every call to wait() actually ends up waiting, such a call should be considered invalid, because of the times we may wait. It would therefore be good to express that in an assertion.

There are, however, places where we currently sleep with interrupts disabled. Although they are technically safe, because we implicitly enable interrupts, they reach wait() in a non-safe state. That happens in the page fault handler. Explicitly enabling interrupts there will allow us to test for valid/invalid wait status. With this test applied, all tests in our whitelist still pass.
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
Signed-off-by: Avi Kivity <avi@cloudius-systems.com>
-
- Dec 13, 2013
-
-
Raphael S. Carvalho authored
Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
-
- Dec 12, 2013
-
-
Pekka Enberg authored
Add a mmu::is_page_aligned() helper function and use it to get rid of open-coded checks.
Reviewed-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
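The helper replaces open-coded `addr & (page_size - 1)` checks. A minimal sketch, assuming 4 KiB pages (OSv's actual definition may differ in detail):

```cpp
#include <cstdint>

// Sketch of an is_page_aligned() helper: an address is page-aligned
// when its low log2(page_size) bits are all zero.
constexpr std::uintptr_t page_size = 4096;

constexpr bool is_page_aligned(std::uintptr_t addr) {
    return (addr & (page_size - 1)) == 0;
}
```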
-
Vlad Zolotarov authored
- It's compiled out when mode=release.
- Uses assert() for issuing the assertion.
- Has printf-like semantics.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
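A macro with those three properties could look like the following sketch; the name `DBG_ASSERT` and exact message format are hypothetical, not what the patch actually adds:

```cpp
#include <cassert>
#include <cstdio>

// Sketch: printf-style assert that is compiled out when NDEBUG is
// defined (mode=release). On failure it prints the formatted message,
// then trips the regular assert().
#ifndef NDEBUG
#define DBG_ASSERT(cond, fmt, ...)                              \
    do {                                                        \
        if (!(cond)) {                                          \
            std::fprintf(stderr, "assert failed: " fmt "\n",    \
                         ##__VA_ARGS__);                        \
            assert(cond);                                       \
        }                                                       \
    } while (0)
#else
#define DBG_ASSERT(cond, fmt, ...) do { } while (0)
#endif
```

The `do { } while (0)` wrapper keeps the macro usable as a single statement in both builds; `##__VA_ARGS__` (a GNU extension supported by gcc and clang) allows calls without format arguments.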
-
Vlad Zolotarov authored
- Add -DNDEBUG to the compiler flags when mode!=debug.
- Prevent assert() from compiling out in the kernel when mode=release.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
-
Vlad authored
Add the missing #ifndef X / #define X guard to include/api/assert.h.
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
-
- Dec 11, 2013
-
-
Pekka Enberg authored
Simplify core/mmu.cc and make it more portable by moving the page fault handler to arch/x64/mmu.cc. There's more arch-specific code in core/mmu.cc that should also be moved.
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
-
Pekka Enberg authored
Make vma constructors more strongly typed by using the addr_range type.
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
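The benefit of a dedicated range type is that a constructor taking `addr_range` cannot silently swap a pair of raw start/end integers. A minimal sketch of the idea (the shape is hypothetical, not OSv's actual addr_range):

```cpp
#include <cassert>
#include <cstdint>

// Sketch: a strongly-typed address range. Constructing one validates the
// invariant start <= end, so vma constructors receive a checked pair
// instead of two interchangeable uintptr_t arguments.
struct addr_range {
    std::uintptr_t start;
    std::uintptr_t end;

    addr_range(std::uintptr_t s, std::uintptr_t e) : start(s), end(e) {
        assert(s <= e);   // catch swapped arguments at construction time
    }

    std::uintptr_t size() const { return end - start; }
};
```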
-
Pekka Enberg authored
Separate the common vma code into an abstract base class that's inherited by anon_vma and file_vma.
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
-
Glauber Costa authored
We currently stop propagating the exception frame partway down the vma_fault path. There is no reason not to propagate it further, other than the fact that there are currently no users. Besides giving us more consistent frame passing, I intend to use it for the JVM balloon.
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
-
- Dec 10, 2013
-
-
Nadav Har'El authored
This patch fixes two bugs in shared-object finalization, i.e., running its static destructors before it is unloaded. The bugs were seen when osv::run()ing a test program using libboost_unit_test_framework-mt.so, which crashed after the test program finished. The two related bugs were:

1. We need to call the module's destructors (run_fini_funcs()) *before* removing it from the module list, otherwise the destructors will not be able to call functions from this module (we got a symbol-not-found error in the destructor).

2. We need to unload the modules needed by this module *before* unloading this module, not after, as was (implicitly) done until now. This makes sense because of symmetry (during a module load, the needed modules are loaded after this one), but also practically: a needed module's destructor (in our case, the boost unit test framework) might refer to objects in the needing module (in our case, the test program), so we cannot call the needed module's destructor after we've already unloaded the needing module.

Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
-
Raphael S. Carvalho authored
Currently, namei() does vget() unconditionally if no dentry is found. This is wrong because the path can be a hard link that points to a vnode that's already in memory. To fix the problem:

- Use the inode number as part of the hash in vget().
- Use vn_lookup() in vget() to make sure we have one vnode in memory per inode number.
- Push the vget() calls down to individual filesystems and make VOP_LOOKUP return a vnode.

Changes since v2:
- v1 dropped the lock in vn_lookup(); assert instead that vnode_lock is held.

Changes since v3:
- Fix a lock ordering issue in dentry_lookup(). The lock for the parent node must be acquired before dentry_lookup() and released after the process is done. Otherwise, a second thread looking up the same dentry may incorrectly take the 'NULL' path.

Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
Signed-off-by: Avi Kivity <avi@cloudius-systems.com>
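Hashing by inode number means two hard links to the same file land in the same cache bucket, where vn_lookup() can then find the existing vnode. A sketch of such a hash (the combining scheme and the function name are illustrative, not the code in this patch):

```cpp
#include <cstdint>
#include <functional>

// Sketch: hash a vnode cache key by (device, inode number). Two paths
// that are hard links to the same inode produce the same bucket, so a
// lookup there can return the vnode already in memory instead of
// creating a duplicate.
std::size_t vnode_bucket(std::uint64_t dev, std::uint64_t ino,
                         std::size_t nbuckets) {
    std::size_t h = std::hash<std::uint64_t>{}(dev);
    // standard hash-combine step to mix the inode number in
    h ^= std::hash<std::uint64_t>{}(ino) + 0x9e3779b97f4a7c15ULL
         + (h << 6) + (h >> 2);
    return h % nbuckets;
}
```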
-
Avi Kivity authored
As ref() is now never called, we can remove the reference counter and make unref() unconditional.
Signed-off-by: Avi Kivity <avi@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
-
Avi Kivity authored
Signed-off-by: Avi Kivity <avi@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
-
Avi Kivity authored
One problem with wake() is that, if the thread it is waking can concurrently exit, it may touch freed memory belonging to the thread structure. Fix by separating the state that wake() touches into a detached_state structure, and free that using RCU. Add a thread_handle class that references only this detached state, and accesses it via RCU.
Signed-off-by: Avi Kivity <avi@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
-
Avi Kivity authored
Makes it much easier to use.
Signed-off-by: Avi Kivity <avi@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
-
Avi Kivity authored
rcu_read_lock disables preemption, but this is an implementation detail and users should not make use of it. Add preempt_lock_in_rcu that takes advantage of the implementation detail and does nothing, but allows users to explicitly disable preemption.
Signed-off-by: Avi Kivity <avi@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
-
Avi Kivity authored
Signed-off-by: Avi Kivity <avi@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
-
Glauber Costa authored
When this flag is seen, pages faulted in should not be filled with zeroes or any other pattern, but rather just left alone in whatever state we find them.
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
-
- Dec 09, 2013
-
-
Pekka Enberg authored
This reverts commit e4aad1ba. It causes tst-vfs.so to just hang.
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
-
- Dec 08, 2013
-
-
Glauber Costa authored
I needed to call detach() in test code of mine, and it isn't implemented. The code I wrote to use it may or may not stay in the end, but nevertheless, let's implement it.
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
Signed-off-by: Avi Kivity <avi@cloudius-systems.com>
-
Glauber Costa authored
Linux uses a 32-bit integer for pid_t, so let's do the same. This is because there are functions in which we have to return our id back to the application. One example is gettid(), which we already have in the tree. Theoretically, we could come up with a mapping between our 64-bit ids and the Linux ones, but since we have to maintain the mapping anyway, we might as well just use the Linux pids as our default IDs. The maximum size for those is 32 bits. That is not enough if we're just allocating pids by bumping a counter, but again, since we will have to maintain the bitmaps anyway, 32 bits allow us as many as 4 billion PIDs.

avi: remove unneeded #include
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
Signed-off-by: Avi Kivity <avi@cloudius-systems.com>
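Bitmap-backed allocation is what makes a 32-bit pid space sufficient, since freed pids can be reused instead of only bumping a counter. A minimal sketch of the idea (class and method names are hypothetical):

```cpp
#include <cstdint>
#include <vector>

// Sketch: pid allocation backed by a bitmap. alloc() returns the lowest
// free pid; free() clears its bit so the pid can be recycled, which is
// what lets a 32-bit pid space last indefinitely.
class pid_bitmap {
    std::vector<bool> used;
public:
    explicit pid_bitmap(std::uint32_t max_pids)
        : used(max_pids, false) {
        used[0] = true;   // pid 0 is conventionally reserved
    }
    std::int32_t alloc() {
        for (std::uint32_t i = 1; i < used.size(); ++i) {
            if (!used[i]) {
                used[i] = true;
                return static_cast<std::int32_t>(i);
            }
        }
        return -1;        // pid space exhausted
    }
    void free(std::int32_t pid) {
        used[static_cast<std::uint32_t>(pid)] = false;
    }
};
```

A real implementation would scan words rather than bits and remember the last allocation point; the linear scan here just keeps the sketch short.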
-
Glauber Costa authored
Right now we take a clock measurement very early for cpu initialization. That forces an unnecessary dependency between the sched and clock initializations. Since that measurement is used to determine for how long the cpu has been running, we can initialize the runtime later, when we init the idle thread. Nothing should be running before it. After doing this, we can move the sched initialization a bit earlier.
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
-
Raphael S. Carvalho authored
Currently, namei() does vget() unconditionally if no dentry is found. This is wrong because the path can be a hard link that points to a vnode that's already in memory. To fix the problem:

- Use the inode number as part of the hash in vget().
- Use vn_lookup() in vget() to make sure we have one vnode in memory per inode number.
- Push the vget() calls down to individual filesystems and make VOP_LOOKUP return a vnode.
- Drop the lock in vn_lookup() and assert that vnode_lock is held.

Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
Signed-off-by: Avi Kivity <avi@cloudius-systems.com>
-
- Dec 05, 2013
-
-
Avi Kivity authored
Prior to 65ccda4c (net: use a file derived class for sockets (socket_file)), ioctl()s for sockets were directed to linux_ioctl_socket() and thence to soo_ioctl(). However, that commit short-circuited linux_ioctl_socket() out and dispatched directly to what was previously known as soo_ioctl() (and became socket_file::ioctl()). This caused interface enumeration ioctl()s to fail, for example in Cassandra. Fix by bringing back the previous behaviour.
Signed-off-by: Avi Kivity <avi@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
-
Glauber Costa authored
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
-
Glauber Costa authored
A list can be slow to search for an element if we have many threads. Even under normal load, the number of threads we spawn may not qualify as huge, but it is not tiny either. Change it to a map so we can implement functions that operate on a given thread without much overhead - O(1) for the common case. Note that ideally we would use an unordered_set, which doesn't require an extra key. However, that would also mean that the key is implicit and of type key_type&. Threads are not very lightweight to create for search purposes, so we go for an id-as-key approach.
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
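The id-as-key approach can be sketched as follows; the table and stub types are hypothetical, but the structure matches the trade-off described above (an explicit id key avoids having to construct a whole thread just to probe an unordered_set):

```cpp
#include <cstdint>
#include <unordered_map>

// Sketch: replace a linked list of all threads with a map keyed by
// thread id, so "find the thread with id N" is O(1) on average
// instead of a linear scan.
struct thread_stub {
    std::uint64_t id;
};

class thread_table {
    std::unordered_map<std::uint64_t, thread_stub*> by_id;
public:
    void insert(thread_stub* t) { by_id[t->id] = t; }
    void erase(std::uint64_t id) { by_id.erase(id); }
    thread_stub* find(std::uint64_t id) {
        auto it = by_id.find(id);
        return it == by_id.end() ? nullptr : it->second;
    }
};
```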
-
Glauber Costa authored
No users in tree.
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
-
Benoît Canet authored
This restores the original behavior of osv::run in place before the mkfs.so and cpiod.so split committed a day ago.
Signed-off-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Nadav Har'El <nyh@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
-
- Dec 04, 2013
-
-
Avi Kivity authored
Everyone is now overriding file's virtual functions; we can make them pure virtual and remove fileops completely.
Signed-off-by: Avi Kivity <avi@cloudius-systems.com>
-
Avi Kivity authored
Unused.
Signed-off-by: Avi Kivity <avi@cloudius-systems.com>
-
Avi Kivity authored
Signed-off-by: Avi Kivity <avi@cloudius-systems.com>
-