Commits · fe5029da98ea3a50ab4d450a16cf055c5553f3a6 · Verlässliche Systemsoftware / projects / osv

May 27, 2013
- tests: add iterations to the TCPDownloadFile test · fe5029da
  Guy Zana authored 11 years ago
  
  making it even more stressful ;)
  fe5029da
- bsd: add -D SMP to build.mak, used by atomic.h · 03fef2b5
  Guy Zana authored 11 years ago
  
  the atomic operations in atomic.h weren't really atomic. this was missed in the netport and is now fixed.
  03fef2b5
- Update misc.bin for boost-system · 78bff744
  Avi Kivity authored 11 years ago
  
  78bff744
- external: update misc.bin for boost-filesystem · 626a908b
  Avi Kivity authored 11 years ago
  
  626a908b
- provide a utsname structure · 43c3f6dd
  Christoph Hellwig authored 11 years ago
  
  ZFS wants direct access to a global utsname structure. Provide one from core OSv code and rewrite uname to just copy it out. To ease this, move the uname implementation to a C file, as this allows using designated initializers and avoids the casting mess around memcpy.
  43c3f6dd
- debug: introduce debug_ll() and use it in abort() · 6ebb582e
  Guy Zana authored 11 years ago
  
  the debug() console function takes a lock before it accesses the console driver; it does that by acquiring a mutex, which may sleep. since we want to be able to debug (and abort) in contexts where it's not possible to sleep, such as in page_fault, a lockless debug print method is introduced. prior to this patch, any abort in page_fault would cause an "endless" recursive abort() loop which hung the system in a peculiar state.
  6ebb582e
- abort: debug() may cause an abort() as well · 9ef87755
  Guy Zana authored 11 years ago
  
  the current code handles the case of recursive aborts incorrectly, while the existing comment is very precise :)
  9ef87755
- zfs: enable the solaris compat <sys/vnode.h> · 1cf30084
  Christoph Hellwig authored 11 years ago
  
  This allows various #if 0'ed code using vnode_t or znode_t to be compiled, both in the current headers and in future ported code.
  1cf30084
- zfs: use get_cpuid() · 5799fb60
  Christoph Hellwig authored 11 years ago
  
  5799fb60
- solaris: allow code using TASKQ_THREADS_CPU_PCT to build · c23a9917
  Christoph Hellwig authored 11 years ago
  
  c23a9917
- solaris: define ptob using PAGE_SIZE instead of PAGE_SHIFT · 7794d411
  Christoph Hellwig authored 11 years ago
  
  7794d411
- solaris: provide more credential related stubs · 13bd4625
  Christoph Hellwig authored 11 years ago
  
  13bd4625
- solaris: include the right <sys/param.h> · c94c43ff
  Christoph Hellwig authored 11 years ago
  
  c94c43ff
- solaris: provide an issig stub · f49ddbca
  Christoph Hellwig authored 11 years ago
  
  f49ddbca
- netport: provide a proc0 definition · b7bdb42c
  Christoph Hellwig authored 11 years ago
  
  BSD and Solaris code likes to pass this identifier for the "kernel" process to various thread creation routines. Make our life simpler by providing it and ignoring it.
  b7bdb42c
- netport: provide physical memory size information · c55bb8b9
  Christoph Hellwig authored 11 years ago
  
  c55bb8b9
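
The recursive-abort problem described in 6ebb582e and 9ef87755 can be sketched as follows. This is a minimal user-space simulation, not the OSv code: `debug_ll_sim`, `abort_sim`, `g_console_sim` and `g_aborting` are hypothetical names, and a `std::vector` stands in for the console driver (a real lockless print would write to the device directly and would not allocate).

```cpp
#include <atomic>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the console device.
std::vector<std::string> g_console_sim;

// Set once the first abort starts; checked to detect recursion.
std::atomic<bool> g_aborting{false};

// Lockless print: takes no mutex, so it cannot sleep and cannot
// itself trigger another abort through the locking path.
void debug_ll_sim(const char* msg) {
    g_console_sim.emplace_back(msg);
}

// Returns true for the first abort and false for a recursive one.
// A real kernel would halt the CPU instead of returning.
bool abort_sim() {
    if (g_aborting.exchange(true)) {
        return false;  // already aborting: do not print or recurse again
    }
    debug_ll_sim("Aborted");
    return true;
}
```

The point of the sketch is the ordering: the recursion flag is set before anything is printed, and the print path used during abort never takes a sleeping lock.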
May 26, 2013

Fix comment · 0ad3e2e0

Nadav Har'El authored 11 years ago

The comment about unlocking the irq_lock was put on the wrong line.
Move it (and rephrase it a bit - the word "release" immediately after
calling an unrelated release() function - is confusing).

0ad3e2e0

Fix two bugs in yield() · 19e52ce6

Nadav Har'El authored 11 years ago

yield() had two bugs - thanks to Avi for pinpointing them:

1. It used runqueue.push_back() to put the thread on the run queue, but
push_back() is a low-level function which can only be used if we're
sure that the item we're pushing has the vruntime suitable for being
*last* on the queue - and in the code we did nothing to ensure this
is the case (we should...). So use insert_equal(), not push_back().

2. It was wrongly divided into two separate sections with interrupts
disabled. The scheduler code is run both at interrupt time (e.g.,
preempt()) and at thread time (e.g., wait(), yield(), etc.) so to
guarantee it does not get called in the middle of itself, it needs
to disable interrupts while working on the (per-cpu) runqueue.
In the broken yield() code, we disabled interrupts while adding the
current thread to the run queue, and then again to reschedule.
Between those two critical sections, an interrupt could arrive and
do something with this thread (e.g., migrate it to another CPU, or
preempt it), yet when the interrupt returns yield continues to run
reschedule_from_interrupt which assumes that this thread is still
running, and definitely not on the run queue.

Bug 2 is what caused the crashes in the tst-yield.so test. The fix is
to hold the interrupts disabled throughout the entire yield().
This is easiest done with lock_guard, which simplifies the flow
of this function.

19e52ce6
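
The two yield() fixes in 19e52ce6 can be illustrated with a small simulation. This is a sketch under stated assumptions, not the scheduler's actual code: `irq_lock_sim`, `thread_sim`, `cpu_sim` and `reschedule_locked` are hypothetical names, the "interrupts disabled" state is just a flag, and the run queue is a `std::multiset` ordered by vruntime.

```cpp
#include <cassert>
#include <mutex>
#include <set>

// Hypothetical stand-in for "interrupts disabled" on one CPU.
// On real hardware lock()/unlock() would disable/enable interrupts.
struct irq_lock_sim {
    bool disabled = false;
    void lock()   { assert(!disabled); disabled = true; }
    void unlock() { disabled = false; }
};

struct thread_sim {
    int id;
    long vruntime;
};

// Orders the run queue by vruntime, smallest first.
struct by_vruntime {
    bool operator()(const thread_sim* a, const thread_sim* b) const {
        return a->vruntime < b->vruntime;
    }
};

struct cpu_sim {
    irq_lock_sim irq;
    std::multiset<thread_sim*, by_vruntime> runqueue;
    thread_sim* current = nullptr;

    // Must be called with interrupts disabled: picks the thread
    // with the smallest vruntime and makes it current.
    void reschedule_locked() {
        assert(irq.disabled);
        if (runqueue.empty()) {
            return;
        }
        auto first = runqueue.begin();
        current = *first;
        runqueue.erase(first);
    }

    void yield() {
        // One critical section for the whole function (fix for bug 2).
        std::lock_guard<irq_lock_sim> guard(irq);
        // Ordered insert rather than appending at the end (fix for bug 1).
        runqueue.insert(current);
        reschedule_locked();
    }
};
```

Because the guard lives for the whole body of `yield()`, nothing can observe the thread as both "running" and "on the run queue" between the insert and the reschedule.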

sched: avoid unnecessary FPU saving · 947b49ee

Nadav Har'El authored 11 years ago

Because of Linux's calling convention, it should not be necessary to
save the FPU state when a reschedule is caused by a function call.

Because we had a bug and forgot to save the FPU state when calling
a signal handler, and because this signal handler can cause a reschedule,
we had to save the FPU on any reschedule. But after fixing that bug, we
no longer need these unnecessary FPU saves.

The "sunflow" benchmark still runs well after this patch.

947b49ee

tests: fix tst-timer, enable test1() and avoid using hardcoded values · c0eebe80
Guy Zana authored 11 years ago

c0eebe80
tst-ctxsw: refine to have warm-up time and fixed execution time · e7dde95d
Avi Kivity authored 11 years ago

e7dde95d

sched: fix preempt_enable() when interrupts are disabled · 84046f23

Avi Kivity authored 11 years ago

If interrupts are disabled, we must not call schedule() even if
the preemption counter says we need to, as the context is not preemption
safe.

This manifested itself in a wake() within a timer causing a schedule(),
which re-enabled interrupts, which caused further manipulation of the timer
list to occur concurrently with the next interrupt, resulting in corruption.

Fixes timer stress test failure.

84046f23
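
The rule introduced by 84046f23 can be sketched like this. The struct and its members (`preempt_sim`, `need_reschedule`, `irqs_enabled`, `schedule_calls`) are hypothetical and only model the decision; the real code may test the interrupt state differently.

```cpp
#include <cassert>

struct preempt_sim {
    int preempt_counter = 0;       // > 0 means preemption is disabled
    bool need_reschedule = false;  // set when a wake() wants a reschedule
    bool irqs_enabled = true;      // simulated interrupt flag
    int schedule_calls = 0;        // how often schedule() actually ran

    void schedule() {
        assert(irqs_enabled);      // never reschedule with interrupts off
        ++schedule_calls;
        need_reschedule = false;
    }

    void preempt_disable() { ++preempt_counter; }

    void preempt_enable() {
        assert(preempt_counter > 0);
        --preempt_counter;
        // Only reschedule if the context is preemption safe:
        // counter back at zero AND interrupts enabled.
        if (preempt_counter == 0 && need_reschedule && irqs_enabled) {
            schedule();
        }
    }
};
```

With interrupts disabled, the pending reschedule stays pending; in this model it is picked up by a later `preempt_enable()` once interrupts are on again.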

tests: extend timer test, make it more stressful · 858f7666

Guy Zana authored 11 years ago

noticed an assert in the download file test that was related to timers;
this test reproduces the same bug.

858f7666

signal handling: fix FPU clobbering bug · 94a7015e

Nadav Har'El authored 11 years ago

This patch adds missing FPU-state saving when calling signal handlers.
The state is saved on the stack, to allow nesting of signal handling
(delivery of a second signal while a first signal's handler is running).

In Linux calling conventions, the FPU state is caller-saved, i.e., a
called function can use FPU at will because the caller is assumed to have
saved it if needed. However, signal handlers are called asynchronously,
possibly in the middle of some FPU computation without that computation
getting a chance to save its state. So we must save this state before calling
the signal handling function.

Without this fix, we had problems even if the signal handlers themselves
did not use the FPU. A typical scenario - which we encountered in the
"sunflow" benchmark - is that the signal handler does something which uses
a mutex (e.g., malloc()) and causes a reschedule. The reschedule, not a
preempt(), thinks it does not need to save the FPU state, and the thread
we switch to clobbers this state.

94a7015e
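
The save/restore pattern from 94a7015e can be modelled in a few lines. This is a simulation only: `fpu_state_sim`, `g_fpu` and `call_signal_handler` are hypothetical names, and a global array stands in for the CPU's FPU registers (real code would use the processor's FPU save and restore instructions).

```cpp
#include <array>
#include <cassert>

// Hypothetical stand-in for the CPU's FPU register file.
struct fpu_state_sim {
    std::array<double, 8> regs{};
};

fpu_state_sim g_fpu;

using signal_handler = void (*)(int);

// Save the FPU state in a local (stack) variable, run the handler,
// then restore. Keeping the copy on the stack is what lets a second
// signal arrive while the first handler is still running.
void call_signal_handler(signal_handler handler, int signo) {
    fpu_state_sim saved = g_fpu;
    handler(signo);
    g_fpu = saved;
}
```

Each nested call gets its own `saved` copy, so the interrupted computation sees its registers unchanged no matter what the handlers did in between.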

tests: make the TCPDownloadFile test a bit more stressful · 1e66b4eb
Guy Zana authored 11 years ago

now it downloads a ~200MB file and validates its md5.

1e66b4eb
loader.py: revisit osv info callouts given the new implementation · cc6ff2f9
Guy Zana authored 11 years ago

cc6ff2f9
tests: fix tst-bsd-callout test, uncomment test #1 · bbf0aa7b
Guy Zana authored 11 years ago

bbf0aa7b

bsd: rewrite callout mechanism to avoid a race · c5dbdcc8

Guy Zana authored 11 years ago

the old implementation used threads for dispatching callouts: each callout
owned a thread, and it suffered from a race where a callout structure could have
been deleted before the callout thread even began.

the current implementation dispatches all callouts in a single callout
dispatcher thread; it maintains an ordered list of callouts to achieve that.

this patch solves a crash in the TCPDownloadFile test, which now proceeds.

c5dbdcc8
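
The single-dispatcher design described in c5dbdcc8 can be sketched as below. This is not the BSD callout API: `callout_dispatcher_sim` and its methods are hypothetical, time is a plain integer, and the dispatch loop is called by hand instead of running in its own thread and sleeping until the earliest deadline.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <utility>
#include <vector>

class callout_dispatcher_sim {
public:
    using queue = std::multimap<std::uint64_t, std::function<void()>>;
    using handle = queue::iterator;

    // Insert a callout; the multimap keeps entries ordered by expiry.
    handle schedule(std::uint64_t expiry, std::function<void()> fn) {
        return _pending.emplace(expiry, std::move(fn));
    }

    // Cancel a callout that is still pending. There is no per-callout
    // thread that could outlive the entry being removed.
    void cancel(handle h) { _pending.erase(h); }

    // Fire every callout whose expiry is <= now, earliest first.
    // Returns the number of callouts fired.
    int run_due(std::uint64_t now) {
        int fired = 0;
        while (!_pending.empty() && _pending.begin()->first <= now) {
            auto fn = std::move(_pending.begin()->second);
            _pending.erase(_pending.begin());
            fn();
            ++fired;
        }
        return fired;
    }

    std::size_t pending() const { return _pending.size(); }

private:
    queue _pending;
};
```

Since all callouts are owned by one ordered container and fired from one place, cancelling is just removing an entry.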

loader.py: make osv info threads more readable · 4a63055e
Guy Zana authored 11 years ago

4a63055e

uma: fix order to finit/dtor in uma_zfree() · 357d68d7

Guy Zana authored 11 years ago

the mbuf ext buffer is freed in the dtor, so it should be called before finit.
this fixes a crash that surfaced when using conf-memory-debug=1

357d68d7
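
The ordering fixed in 357d68d7 amounts to the following. The names `zone_sim`, `zfree_sim` and `g_zfree_log` are hypothetical and the sketch leaves out the actual release of memory; it only shows that the destructor runs before the finalizer.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Records the order in which the hooks ran (for illustration only).
std::vector<std::string> g_zfree_log;

struct zone_sim {
    void (*dtor)(void* item);  // per-free hook, e.g. frees an attached buffer
    void (*fini)(void* item);  // final teardown of the item
};

void zfree_sim(const zone_sim& zone, void* item) {
    if (!item) {
        return;
    }
    // dtor first: it may still need fields that fini tears down.
    if (zone.dtor) {
        zone.dtor(item);
    }
    if (zone.fini) {
        zone.fini(item);
    }
}
```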

bsd: zero a few uninitialized structures · 62712056

Guy Zana authored 11 years ago

this hasn't caused a real bug; I just noticed it while tracing.
it may be dangerous if, in some flow, the stack is not zeroed

62712056

bsd: implement panic() · 91db62cf
Guy Zana authored 11 years ago

91db62cf
libc: fix strerror_r, should not appear as __xpg_strerror_r() · c481b94d
Guy Zana authored 11 years ago

strerror_r is needed by the JVM in order to print errors correctly.

c481b94d

May 25, 2013
- todo: add todo/fpu · b1ec4adc
  Nadav Har'El authored 11 years ago
  
  A todo item to avoid unnecessary fpu state saving
  b1ec4adc
- todo: add todo/mutex · 682e57fd
  Nadav Har'El authored 11 years ago
  
  Things we still need to do to use the lockfree mutex
  682e57fd
May 24, 2013
- Add "todo" directory. · 6347f632
  Nadav Har'El authored 11 years ago
  
  Following an idea raised during our last discussion on TODOs, I committed a new directory "todo/", and one file in it, todo/mm, with TODOs and ideas for memory management. I suggest that we allow free commits to this directory - with no overhead of "review" of these informal todo items. If we decide we don't like this format, we can easily move the content of these files to some other tool or database or whatever.
  6347f632
- netport: provide a CACHE_LINE_SIZE definition · cccd2064
  Christoph Hellwig authored 11 years ago
  
  cccd2064
- netport: provide a SYSCTL_UQUAD definition · a279e346
  Christoph Hellwig authored 11 years ago
  
  a279e346
- netport: provide a TUNABLE_QUAD definition · 055b338e
  Christoph Hellwig authored 11 years ago
  
  055b338e
- prex: remove the file_t typedef · 0617cbf0
  Christoph Hellwig authored 11 years ago
  
  0617cbf0