- Jun 05, 2013
-
-
Avi Kivity authored
In order to optimize the fast path of tracepoints, we need to patch the call sites to skip calling the slow path code completely. In turn, that requires that each call site be unique -- a separate function. In the current implementation, tracepoints with the same signature map to the same type. It would have been great to use the name as a discriminant (tracepoint<"sched_queue", thread*> trace_sched_queue(...);), but C++ does not support string literals as template arguments. We could write const char* trace_sched_queue_name = "sched_queue"; tracepoint<trace_sched_queue_name, thread*> trace_sched_queue(...); but that doubles the code needed to declare a tracepoint. Add a unique integer ID instead (and code to verify it is unique).
-
- May 30, 2013
-
-
Nadav Har'El authored
Previously, we re-implemented "unsupported" file operations (e.g., chmod for a pipe, on which fchmod makes no sense) several times - there was an implementation only for chmod in kern_descrip.c, used in sys_socket.c, and af_local.cc had its own. As we add more file descriptor types (e.g., create_epoll()) we'll have even more copies of these silly functions, so let's implement them once in fs/unsupported.c - with the fs/unsupported.h header file. This also gives us a central place to document (and argue) whether an unimplemented ioctl() should return ENOTTY or EBADF (I think the former).
-
- May 27, 2013
-
-
Guy Zana authored
The debug() console function takes a lock before it accesses the console driver; it does so by acquiring a mutex, which may sleep. Since we want to be able to debug (and abort) in contexts where it is not possible to sleep, such as in page_fault, this patch introduces a lockless debug print method. Previously, any abort inside page_fault caused an "endless" recursive abort() loop which hung the system in a peculiar state.
-
- May 26, 2013
-
-
Avi Kivity authored
These are used to support ifunc functions, which are resolved at load time based on CPU features, rather than at link time.
-
- May 24, 2013
-
-
Christoph Hellwig authored
-
Christoph Hellwig authored
-
Christoph Hellwig authored
-
Christoph Hellwig authored
-
Christoph Hellwig authored
-
Christoph Hellwig authored
-
Christoph Hellwig authored
-
Christoph Hellwig authored
-
Christoph Hellwig authored
-
- May 23, 2013
-
-
Avi Kivity authored
-
Avi Kivity authored
If there's nothing in the cpu_set (which is fairly common), there's no need to use an atomic operation.
-
- May 22, 2013
-
-
Nadav Har'El authored
1. osv::poweroff(), which can turn off a physical machine or, in our case, tell QEMU to quit. The implementation uses ACPI, through the ACPICA library.
2. osv::hang(), which ceases all computation on all cores but does not turn off the machine. This can be useful if we want QEMU to remain alive for debugging, for example.
The two functions are defined in the new <osv/power.hh> header file, and follow the new API guidelines we discussed today: they are C++-only, and live in the "osv" namespace.
-
Nadav Har'El authored
Added a timeout parameter to semaphore::wait(), which defaults to no timeout. semaphore::wait() now returns a boolean, just like trywait(), and likewise can return false when the semaphore has not actually been decremented because we timed out. Because we need the mutex again after the wait, I replaced the "with_lock" mechanism with the better-looking lock_guard and a mutex parameter to wait_until.
-
Nadav Har'El authored
Leak detection (e.g., by running with "--leak") used to have a devastating effect on the performance of the checked program. Although this was tolerable (for leak detection, long runs are often unnecessary), it was still annoying. While before this patch leak-detection runs were roughly 5 times slower than regular runs, after this patch they are only about 40% slower than a regular run! Read on for the details.

The main reason for this slowness was a simplistic vector used to keep the records for currently living allocations. This vector was searched linearly, both for free spots (to remember new allocations) and for specific addresses (to forget freed allocations). Because this list often grew to a hundred thousand items, it became incredibly slow and slowed down the whole program. For example, getting a prompt from cli.jar took 2 seconds without leak detection, but 9 seconds with leak detection.

A possible solution would have been to use an O(1) data structure, such as a hash table. This is complicated by our desire to avoid frequent memory allocation inside the leak detector, and by our general desire to avoid complicated stuff in the leak detector, because it always ends up leading to complicated deadlocks :-)

This patch uses a different approach, inspired by an idea by Guy. It still uses an ordinary vector for holding the records, but additionally keeps for each record one "next" pointer, used for maintaining two separate lists of records:
1. A list of free records. This allows finding a record for a new allocation in O(1) time.
2. A list of filled records, starting with the most-recently-filled record. When we free(), we walk this list and very often finish very quickly, because malloc() closely followed by free() is very common. Without this list, we had to walk the whole vector, filled with ancient allocations and even free records, just to find the most recent allocation.
Two examples of the performance with and without this patch: 1. Getting a prompt from cli.jar takes 2 seconds without leak detection, 9 seconds with leak detection before this patch, and 3 seconds with this patch. 2. The "sunflow" benchmark runs at 53 ops/second without leak detection, which went down to 10 ops/second with leak detection before this patch, and is back up to 33 ops/second after it. I verified (by commenting out the search algorithm and always using the first item in the vector) that the allocation-record search no longer has any effect on performance, so it is no longer interesting to replace this code with an even more efficient hash table. The remaining slowdown is probably due to the backtrace() operation, and perhaps also the tracker lock.
-
Avi Kivity authored
Intrusive lists are faster since they require no allocations.
-
Avi Kivity authored
Previously, the mutex was stored using a pointer to avoid overflowing glibc's sem_t. Now we no longer have this restriction, drop the indirection.
-
Avi Kivity authored
Use Nadav's idea of iterating over the list and selecting wait records that fit the available units.
-
Avi Kivity authored
No code changes.
-
- May 21, 2013
-
-
Nadav Har'El authored
As Avi suggested, add an option (turned on by default) to remember only the most recent function calls in an allocation's stack trace - instead of the highest-level (outermost) calls, as until now. In our project, where we often don't care about the top-level functions (various Java stuff), this is more useful.
-
- May 20, 2013
-
-
Avi Kivity authored
pthread_mutex_t has a 32-bit field, __kind, at offset 16. Non-standard static initializers set this field to a nonzero value, which can corrupt fields in our implementation. Rearrange field layout so we have a hole in that position. To keep the structure size small enough so that condvar will still fit in pthread_condvar_t, we need to change the size of the _depth field to 16 bits.
-
Avi Kivity authored
Instead of a subclass, make it a thread attribute. This simplifies usage and also allows detaching a thread after creation.
-
Avi Kivity authored
Detached threads are auto collected, so give users a chance to execute some cleanup function before dying.
-
Nadav Har'El authored
The previous limit was too small for the very deep calls in Java. Increase it. In the future, I should add an option to save only the deepest calls, not the calls nearest the root.
-
Nadav Har'El authored
mmu::allocate(), implementing mmap(), used to first evacuate the region (marking it free), then allocate a tiny vma object (a start,end pair), and finally populate the region. But it turns out that the allocation, if it calls backtrace() for the first time, ends up calling mmap() too :-) These two running mmap()s aren't protected from each other by the mutex (it's the same thread), and the second mmap could take the region just freed by the first mmap - before returning to the first mmap, which would then reuse this region. We solve this bug by allocating the vma object before evacuating the region, so the other mmap picks different memory. Before this fix, "--leak tests/tst-mmap.so" crashes with an assertion failure. With this fix, it succeeds.
-
Nadav Har'El authored
libunwind, which the next patches will use to implement a more reliable backtrace(), needs the msync() function. It doesn't need it to actually sync anything - just to recognize valid frame addresses (stacks are always mmap()ed). Note this implementation does the checking, but is missing the "sync" part of msync ;-) It doesn't matter because: 1. libunwind doesn't need (or want) this syncing, and neither does anything else in the Java stack (until now, msync() was never used). 2. We don't (yet?) have write-back of mmap'ed memory anyway, so there's no sense in doing any writing in msync either. We'll need to work on a full read-write implementation of file-backed mmap() later.
-
- May 18, 2013
-
-
Avi Kivity authored
include/glibc-compat defines headers that munge the musl headers (by including other headers or defining some symbols) using #include_next. Eventually it should go away, but for now it reduces churn.
-
Avi Kivity authored
musl doesn't provide it.
-
Avi Kivity authored
-
Avi Kivity authored
With musl's definition, the compile fails. It isn't justified anyway.
-
Avi Kivity authored
We now get some of these from musl; remove them from our list.
-
Avi Kivity authored
Need to get __BEGIN_DECLS from somewhere.
-
Avi Kivity authored
SIOCSIFNAME (which is named after the BSD creator's daughter, incidentally).
-
Avi Kivity authored
Add missing symbols to make bsd happy.
-
Avi Kivity authored
Missing timezone type.
-
Avi Kivity authored
confused with bsd's
-
Avi Kivity authored
-