Commits · 9c9262b00f372f4c16cded5e67ec1ce27d888f1e · Verlässliche Systemsoftware / projects / osv

Dec 10, 2013

sched: remove thread::_ref_counter · 9c9262b0

Avi Kivity authored 11 years ago


As ref() is now never called, we can remove the reference counter and
make unref() unconditional.

Signed-off-by: Avi Kivity <avi@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>

9c9262b0

sched: change wake_with() to use rcu locking · 3dec9895

Avi Kivity authored 11 years ago


Signed-off-by: Avi Kivity <avi@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>

3dec9895

sched: add a wake() function that is safe to use on a thread that may terminate · dc40b49e

Avi Kivity authored 11 years ago


One problem with wake() is that, if the thread it is waking can concurrently
exit, it may touch freed memory belonging to the thread structure.

Fix by separating the state that wake() touches into a detached_state
structure, and free that using rcu.

Add a thread_handle class that references only this detached state, and
accesses it via rcu.
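
A minimal sketch of the separation described above. It uses std::shared_ptr as a stand-in for RCU-deferred freeing, which is a substitution and not what the patch does; only the names detached_state and thread_handle come from the commit message, while sketch_thread and all members are hypothetical. It illustrates object lifetime only and is not a concurrency-correct wake protocol.

```cpp
#include <atomic>
#include <memory>
#include <utility>

// The only state wake() is allowed to touch. It can outlive the thread object.
struct detached_state {
    std::atomic<bool> woken{false};
    std::atomic<bool> terminated{false};
};

// References only the detached state, never the thread object itself.
class thread_handle {
public:
    explicit thread_handle(std::shared_ptr<detached_state> ds) : _ds(std::move(ds)) {}

    // Safe to call even after the thread object has been destroyed.
    // Returns false if the thread had already terminated.
    bool wake()
    {
        if (_ds->terminated.load()) {
            return false;
        }
        _ds->woken.store(true);
        return true;
    }

private:
    std::shared_ptr<detached_state> _ds;
};

// Hypothetical thread object; the real one carries far more state.
class sketch_thread {
public:
    sketch_thread() : _ds(std::make_shared<detached_state>()) {}
    ~sketch_thread() { _ds->terminated.store(true); }

    thread_handle handle() const { return thread_handle(_ds); }
    bool woken() const { return _ds->woken.load(); }

private:
    std::shared_ptr<detached_state> _ds;
};
```

Because the handle keeps its own reference to the detached state, a late wake() never dereferences the destroyed thread object.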

Signed-off-by: Avi Kivity <avi@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>

dc40b49e

rcu: make rcu_ptr default initialize to a reasonable value · a828b340

Avi Kivity authored 11 years ago


Makes it much easier to use.

Signed-off-by: Avi Kivity <avi@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>

a828b340

rcu: add preempt_lock_in_rcu · 18cee681

Avi Kivity authored 11 years ago


rcu_read_lock disables preemption, but this is an implementation detail
and users should not make use of it.

Add preempt_lock_in_rcu that takes advantage of the implementation detail
and does nothing, but allows users to explicitly disable preemption.

Signed-off-by: Avi Kivity <avi@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>

18cee681

rcu: forward declare preempt_enable() to avoid #include hell · dff28306

Avi Kivity authored 11 years ago


Signed-off-by: Avi Kivity <avi@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>

dff28306

mman: Fix errno handling in mmap and munmap · 2358ac62

Pekka Enberg authored 11 years ago


Nadav Har'El explains:

  Traditionally, functions which succeed do NOT set errno to zero, but
  rather leave it unchanged (errno(3) on Linux says, for example, that
  "errno is never set to zero by any system call or library function.").

Reviewed-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
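
A minimal sketch of the errno convention quoted above, assuming a hypothetical wrapper checked_unmap() and a hypothetical backend do_unmap(); neither is OSv's actual mmap/munmap code. The point is only that errno is written on the failure path and left alone on success.

```cpp
#include <cerrno>
#include <cstddef>

// Hypothetical backend: returns 0 on success or a positive error code.
// It stands in for whatever the real implementation calls internally.
static int do_unmap(void* addr, std::size_t length)
{
    if (addr == nullptr || length == 0) {
        return EINVAL;
    }
    return 0;
}

// Hypothetical libc-style wrapper. On success errno is left untouched;
// it is assigned only when the call fails.
int checked_unmap(void* addr, std::size_t length)
{
    int error = do_unmap(addr, length);
    if (error != 0) {
        errno = error;
        return -1;
    }
    return 0;
}
```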

2358ac62

tests: Add munmap tests into tst-mmap-file · 06d3b771

Raphael S. Carvalho authored 11 years ago


Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>

06d3b771

libc: Add munmap validation · 8c57f767

Raphael S. Carvalho authored 11 years ago


Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>

8c57f767

mmu: support MAP_UNINITIALIZED flag · f7249e73

Glauber Costa authored 11 years ago


When seeing this flag, pages faulted in should not be filled with zeroes or any
other patterns, and should rather be just left alone in whatever state we find
them at.
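
A minimal sketch of the behaviour described above, assuming a hypothetical flag constant and a hypothetical prepare_faulted_page() helper; the real flag value and the fault path in the mmu code are not shown in this message.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical values, chosen only for this sketch.
constexpr unsigned sketch_map_uninitialized = 1u << 0;
constexpr std::size_t sketch_page_size = 4096;

// Prepares a freshly faulted-in page. With the flag set the page is left in
// whatever state it was found; otherwise it is zero-filled.
void prepare_faulted_page(unsigned char* page, unsigned flags)
{
    if (flags & sketch_map_uninitialized) {
        return;
    }
    std::memset(page, 0, sketch_page_size);
}
```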

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>

f7249e73

Dec 09, 2013

build: Fix some debug build errors · 2a4f991d

Vlad Zolotarov authored 11 years ago


Add -Wno-maybe-uninitialized to compilation flags when mode=debug to
avoid bogus compilation errors.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>

2a4f991d

runtime: fix prio_find_thread() ignoring missing threads · 0fd4f259

Avi Kivity authored 11 years ago


prio_find_thread() is not checking correctly for missing threads, and
may return nulls to the caller.

Signed-off-by: Avi Kivity <avi@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>

0fd4f259

Implement mknod() · dd701e2d

Nadav Har'El authored 11 years ago


I tried using a test which called mknod() (to create an empty regular file).
Despite us having an mknod() implementation, it didn't work, and failed on
lookup of the symbol __xmknod.

Turns out that in glibc, mknod() is source-only, and converted to the ABI
function which is __xmknod, whose first parameter is a version number
_MKNOD_VER_LINUX (0 on x86-64 Linux).

So this patch implements __xmknod, and now mknod() works.

Note we already had the same kind of trick for __xstat(), needed so that
stat() would work.
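
A minimal sketch of the shape of such an entry point, under stated assumptions: it is named sketch_xmknod so it does not clash with the real glibc symbol, the backend callback is hypothetical, and rejecting a mismatched version with EINVAL is an assumption of this sketch rather than something the patch is known to do.

```cpp
#include <cerrno>
#include <sys/stat.h>
#include <sys/types.h>

// Version number passed by glibc on x86-64 Linux, per the commit message.
constexpr int sketch_mknod_ver_linux = 0;

// Hypothetical backend that would actually create the node.
using mknod_backend = int (*)(const char* path, mode_t mode, dev_t dev);

// Sketch of an __xmknod-style function: the first parameter is the version
// number and the device is passed by pointer.
int sketch_xmknod(int ver, const char* path, mode_t mode, dev_t* dev,
                  mknod_backend backend)
{
    if (ver != sketch_mknod_ver_linux || dev == nullptr) {
        errno = EINVAL;
        return -1;
    }
    return backend(path, mode, *dev);
}
```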

Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>

dd701e2d

libc/mount: Change umount2 and add umount · 2050ce8c

Raphael S. Carvalho authored 11 years ago

umount2 should call sys_umount2 instead. Add umount that calls sys_umount.

Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>

2050ce8c

loader.py: Skip inlined frames in 'osv info threads' · 82cd0548

Tomasz Grabiec authored 11 years ago


GDB python API does not handle inlined functions as nicely as regular
'backtrace' command does. Part of the frame attributes point to
the inlined function (variables, symtab) and part point to the caller.

For example frame.function() returns the nearest non-inlined function.

This breaks code which prints thread joining information.
The code thinks it's in "sched::thread::join()" when actually it's in
sched::schedule() context which does not have 'this' variable.

This solution skips inlined functions when considering print
candidates.  The printed information would be confusing anyway: file
and line number would be of the inlined function but printed function
name would belong to the caller. Finally we will reach the non-inlined
caller and print the call site properly.

Fixes #124.

Signed-off-by: Tomasz Grabiec <tgrabiec@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>

82cd0548

loader.py: Make 'osv info threads' not fail when all frames are blacklisted · 9e91603f

Tomasz Grabiec authored 11 years ago


When all resolved frames are blacklisted we try to print the oldest
resolved frame.

Signed-off-by: Tomasz Grabiec <tgrabiec@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>

9e91603f

virtio-rng: Do not call queue->get_buf_gc() with wait_until · f0706cec

Asias He authored 11 years ago


Signed-off-by: Asias He <asias@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>

f0706cec

virtio-blk: Do not call queue->get_buf_gc() with wait_until · 1cc3dee7

Asias He authored 11 years ago


When I hacked use_indirect() to always use indirect buffer, I saw this
assertion when running:

   $scripts/run.py  -e "/tests/tst-bdev-write.so vblk1"

   VFS: mounting devfs at /dev
   51.671 Mb/s
   Assertion failed: _status.load() == status::running
   (/home/asias/src/cloudius-systems/osv/core/sched.cc: prepare_wait: 655) Aborted

It turned out that we were making a waiting thread wait again: get_buf_gc()
calls free(), which might put the thread into the waiting state again.

Suggested-by: Dor Laor <dor@cloudius-systems.com>
Signed-off-by: Asias He <asias@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>

1cc3dee7

virtio: Add vring::used_ring_can_gc() helper · b7f8fa6d

Asias He authored 11 years ago


It is useful to test if we can do gc on the used ring.

Signed-off-by: Asias He <asias@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>

b7f8fa6d

tests: Test huge page allocation failure. · 4eb97417

Glauber Costa authored 11 years ago

Our implementation of operate() will try to fill as much of the address space
as possible with huge pages. If that fails, we should be able to fill the range
with small pages instead of failing. This test should make sure that in such
scenarios, the resulting mapping looks sane.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>

4eb97417

mmu: don't bail out on huge page failure · eeeaf888

Glauber Costa authored 11 years ago


Addressing that FIXME, as part of my memory reclamation series. But this
is ready to go already. The goal is to retry to serve the allocation if a
huge page allocation fails, and fill the range with the 4k pages.

The simplest and most robust way I've found to do that was to propagate the
error up until we reach operate(). Being there, all we need to do is to
re-walk the range with 4k pages instead of 2MB.

We could theoretically just bail out on huge pages and move hp_end, but,
especially when we have reclaim, it is likely that one operation will fail while
the upcoming ones may succeed.
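
A minimal sketch of the retry idea, assuming a hypothetical fill_range() walker and an allocator callback; the real patch propagates the error up to operate() and re-walks the range there, which this sketch collapses into one loop. Each 2MB chunk is tried independently, so one failed huge page allocation does not stop later chunks from using huge pages.

```cpp
#include <cstddef>
#include <functional>

constexpr std::size_t small_page = 4096;
constexpr std::size_t huge_page = 2 * 1024 * 1024;

// Counts how many pages of each size ended up backing the range.
struct fill_stats {
    std::size_t huge = 0;
    std::size_t small = 0;
};

// Hypothetical walker: covers [0, length) with huge pages where the
// allocator succeeds, and fills a failed 2MB chunk with 4k pages instead.
// length is assumed to be a multiple of huge_page.
fill_stats fill_range(std::size_t length, const std::function<bool()>& alloc_huge)
{
    fill_stats stats;
    for (std::size_t off = 0; off < length; off += huge_page) {
        if (alloc_huge()) {
            ++stats.huge;
        } else {
            stats.small += huge_page / small_page;
        }
    }
    return stats;
}
```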

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
[ penberg: s/NULL/nullptr/ ]
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>

eeeaf888

Reindent fs/vfs/main.cc · eb8451a0

Nadav Har'El authored 11 years ago


main.cc was still using tab characters instead of spaces as our coding
conventions dictate. Reindent it, using Eclipse's ctrl-I.
This patch doesn't change anything else.

Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
Signed-off-by: Avi Kivity <avi@cloudius-systems.com>

eb8451a0

Revert "vfs: Fix duplicate in-memory vnodes" · 0984e12e

Pekka Enberg authored 11 years ago


This reverts commit e4aad1ba.

It causes tst-vfs.so to just hang.

Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>

0984e12e

Revert "Add tests into tst-fs-link.so to check vnode duplicity" · dfeb8a13

Pekka Enberg authored 11 years ago


This reverts commit bdd99c7b.

Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>

dfeb8a13

Some small fixes to tst-pipe.cc · 99456e07

Nadav Har'El authored 11 years ago

tst-pipe.cc read a buffer after freeing it, which could have theoretically
caused segfaults (it didn't in practice, but better fix this oversight).

Also, it forgot to return a return code, so it doesn't play nicely in
a test framework like testrunner.so. I'm surprised the C++ compiler wasn't
bothered by an int main() not returning an int.

Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>

99456e07

libc: Fix remove() return value · 7a986ba7

Nadav Har'El authored 11 years ago

The remove() function is part of the ISO C 1989 standard, and used, for
example, to implement Java's File.delete(). It's supposed to remove a
file, regardless of whether unlink() or rmdir() is needed to remove it.

Our implementation (from Musl's) assumed that unlink() on a directory fails
with EISDIR, and only in that case it tried rmdir(). However, returning
EISDIR on unlink() is a Linux extension, which (deliberately) goes against
the Posix standard - which specified EPERM should be returned in that case.
Our ZFS implementation of unlink, following Solaris and FreeBSD (and not
Linux), returns EPERM in that case.

This meant that remove() used to fail deleting empty directories, and
Java code (like the SpecJVM2008 "derby" benchmark) using it to recursively
delete a directory, left behind undeleted empty directories.

So this patch fixes remove() to try rmdir() if unlink() returned either
the Linux-specific EISDIR, or the Posix-standard EPERM. It also adds
to the readdir test another test which verifies that remove() can delete
all files in a directory - both regular files and empty directories.
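
A minimal sketch of the fallback described above, named sketch_remove() to make clear it is not the patched libc source. It assumes a POSIX environment providing unlink() and rmdir().

```cpp
#include <cerrno>
#include <unistd.h>

// Try unlink() first; fall back to rmdir() when the error says the path is a
// directory (EISDIR on Linux, EPERM per POSIX and on ZFS-derived code).
int sketch_remove(const char* path)
{
    if (unlink(path) == 0) {
        return 0;
    }
    if (errno == EISDIR || errno == EPERM) {
        return rmdir(path);
    }
    return -1;
}
```

If unlink() returned EPERM for some other reason, such as a permission problem on a regular file, the rmdir() attempt fails as well and the caller still sees an error.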

Fixes #112.

Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>

7a986ba7

Don't assert() in tests/tst-readdir.cc · 6e549175

Nadav Har'El authored 11 years ago

Before this patch, tst-readdir.cc assert()ed on every test, meaning that
a failure will cause a crash. Change this to use a report() function,
which counts failures instead of immediately crashing on the first one.
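
A minimal sketch of such a helper; the counters and the summary() function are assumptions made for illustration and may differ from what the test actually contains.

```cpp
#include <iostream>

static int tests = 0;
static int fails = 0;

// Records the outcome of one check instead of aborting on the first failure.
void report(bool ok, const char* msg)
{
    ++tests;
    fails += !ok;
    std::cout << (ok ? "PASS" : "FAIL") << ": " << msg << "\n";
}

// Suitable as the return value of main(): zero only if every check passed.
int summary()
{
    std::cout << "SUMMARY: " << tests << " tests, " << fails << " failures\n";
    return fails == 0 ? 0 : 1;
}
```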

This patch doesn't change anything in what this test actually tests for.

Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>

6e549175

build: use -Og for mode=debug, if available · f44ed7b9

Avi Kivity authored 11 years ago


gcc recommends -Og for debugging; follow its advice.

Signed-off-by: Avi Kivity <avi@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>

f44ed7b9

virtio: Fix vring::use_indirect · 6204bf4e

Asias He authored 11 years ago


When the _avail_count is less than 1/3 of the ring, we start using
indirect descriptors.
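
A minimal sketch of the threshold as a free-standing function. In the patch this is a member of vring that reads the ring's own counters; the two-parameter signature here is hypothetical.

```cpp
// Returns true when fewer than one third of the ring's descriptors are
// available, which is when indirect descriptors start being used.
bool use_indirect(unsigned avail_count, unsigned ring_size)
{
    // Multiply rather than divide so integer rounding does not move the
    // threshold.
    return avail_count * 3 < ring_size;
}
```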

Signed-off-by: Asias He <asias@cloudius-systems.com>
Signed-off-by: Dor Laor <dor@cloudius-systems.com>
Signed-off-by: Avi Kivity <avi@cloudius-systems.com>

6204bf4e

virtio: Do not scale desc_needed by _num / 2 · 24bbcd78

Asias He authored 11 years ago


There is no reason to do the scaling.

Signed-off-by: Asias He <asias@cloudius-systems.com>
Signed-off-by: Dor Laor <dor@cloudius-systems.com>
Signed-off-by: Avi Kivity <avi@cloudius-systems.com>

24bbcd78

Dec 08, 2013

tests: add test for thread completion · d213c3fe

Glauber Costa authored 11 years ago

That test goes together with thread detach, but I am also calling joins
to make sure we're not breaking them. It is unfortunate that this is quite
non-deterministic and we can't reliably test for failure. But on the
flip side, it did help me catch a couple of bugs in my implementation. So
it will eventually explode somewhere if a bug appears.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
Signed-off-by: Avi Kivity <avi@cloudius-systems.com>

d213c3fe

sched: implement pthread_detach · afcf4735

Glauber Costa authored 11 years ago


I needed to call detach in a test code of mine, and this isn't implemented.
The code I wrote to use it may or may not stay in the end, but nevertheless,
let's implement it.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
Signed-off-by: Avi Kivity <avi@cloudius-systems.com>

afcf4735

sched: standardize call to _cleanup · d754d662

Glauber Costa authored 11 years ago

set_cleanup is quite a complicated piece of code. It is very easy to get it to
race with other thread destruction sites, which was made abundantly clear when
we tried to implement pthread detach.

This patch tries to make it easier, by restricting how and when set_cleanup can
be called. The trick here is that currently, a thread may or may not have a
cleanup function, and through a call to set_cleanup, our decision to cleanup
may change.

From this point on, set_cleanup will only tell us *how* to cleanup. If and
when, is a decision that we will make ourselves. For instance, if a thread
is block-local, the destructor will be called by the end of the block. In
that case, the _cleanup function will be there anyhow: we'll just not call
it.

We're setting here a default cleanup function for all created threads, that
just deletes the current thread object. Anything coming from pthread will try
to override it by also deleting the pthread object. And again, it is important
to note that they will set up those cleanup functions unconditionally.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
Signed-off-by: Avi Kivity <avi@cloudius-systems.com>

d754d662

sched: Use an integer for thread ids · 5c652796

Glauber Costa authored 11 years ago

Linux uses a 32-bit integer for pid_t, so let's do it as well. This is because
there are functions in which we have to return our id back to the application.
One application is gettid, that we already have in the tree.

Theoretically, we could come up with a mapping between our 64-bit id and the
Linux one, but since we have to maintain the mapping anyway, we might as well
just use the Linux pids as our default IDs. The max size for that is 32-bit. It
is not enough if we're just allocating pids by bumping the counter, but again,
since we will have to maintain the bitmaps, 32-bit will allow us as much as 4
billion PIDs.

avi: remove unneeded #include

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
Signed-off-by: Avi Kivity <avi@cloudius-systems.com>

5c652796

xen: disable pvclock for more than 32 CPUs · 0ddd6ef1

Glauber Costa authored 11 years ago

Xen's shared info contains hardcoded space for only 32 CPUs. Because we use
those structures to derive timing information, we would be basically accessing
random memory after that. This is very hard to test and trigger, so what I
did to demonstrate I was right (although that wasn't really needed, math could
be used for that...) was to print the first timing information a cpu would
produce. I could verify that the time reported on CPUs > 32 was behind
the time produced on CPUs < 32.

It is possible to move the vcpu area to a different location, but this is a
relatively new feature of the Xen Hypervisor: Amazon won't support it. So
we need a disable path anyway. I will open up an issue for somebody to implement
that support eventually.

Another user of the vcpu structure is interrupts. But for interrupts the story
is easier, since we can select which CPUs we can take interrupts at, and only
take them in the first 32 CPUs. In any case, we're taking them all in CPU0 now,
so that is already under control.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>

0ddd6ef1

sched: initialize clock later · 1d31d9c3

Glauber Costa authored 11 years ago

Right now we are taking a clock measure very early for cpu initialization.
That forces an unnecessary dependency between sched and clock initializations.

Since that clock is used to determine how long the cpu has been running, we
can initialize the runtime later, when we init the idle thread. Nothing should
be running before it. After doing this, we can move the sched initialization
a bit earlier.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>

1d31d9c3

xen: int vs long issues - OSv side · 1bbe05dd

Glauber Costa authored 11 years ago


It seems that we also had problems with our own code for int vs long
issues. I am really surprised that the C++ compiler didn't throw any
warnings for this since all word sizes are quite explicit. In any case,
this seems to be the missing piece for xen booting with many CPUs.

It boots fine now with up to 32 CPUs. After that, other problems start
to appear.

Fixes #113

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>

1bbe05dd

Add tests into tst-fs-link.so to check vnode duplicity · bdd99c7b

Raphael S. Carvalho authored 11 years ago


Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
Signed-off-by: Avi Kivity <avi@cloudius-systems.com>

bdd99c7b

vfs: Fix duplicate in-memory vnodes · e4aad1ba

Raphael S. Carvalho authored 11 years ago


Currently, namei() does vget() unconditionally if no dentry is found.
This is wrong because the path can be a hard link that points to a vnode
that's already in memory.

To fix the problem:

  - Use inode number as part of the hash in vget()

  - Use vn_lookup() in vget() to make sure we have one vnode in memory
    per inode number.

  - Push the vget() calls down to individual filesystems and make
    VOP_LOOKUP return a vnode

  - Drop lock in vn_lookup() and assert that vnode_lock is held.
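
A minimal sketch of the "one vnode in memory per inode number" idea from the list above. The vnode_table class and sketch_vnode type are hypothetical; the sketch ignores mount points, locking and reference counting, all of which the real vget() and vn_lookup() have to deal with.

```cpp
#include <cstdint>
#include <memory>
#include <unordered_map>

// Hypothetical, stripped-down vnode carrying only its inode number.
struct sketch_vnode {
    std::uint64_t ino;
};

class vnode_table {
public:
    // Returns the vnode already in memory for this inode number, or creates
    // one if there is none. Two hard links to the same inode therefore
    // resolve to the same in-memory vnode.
    std::shared_ptr<sketch_vnode> vget(std::uint64_t ino)
    {
        auto it = _by_ino.find(ino);
        if (it != _by_ino.end()) {
            return it->second;
        }
        auto vp = std::make_shared<sketch_vnode>(sketch_vnode{ino});
        _by_ino.emplace(ino, vp);
        return vp;
    }

private:
    std::unordered_map<std::uint64_t, std::shared_ptr<sketch_vnode>> _by_ino;
};
```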

Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
Signed-off-by: Avi Kivity <avi@cloudius-systems.com>

e4aad1ba

Dec 07, 2013

loader.py: unbreak info threads · d604307d

Glauber Costa authored 11 years ago

My thread patch broke info threads. My bad: Nadav noticed in his review
that it would, but I ended up forgetting about it when I reworked it. In
any case, with the following fix it works again.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>

d604307d