Commits · 76dae193232cd9caf548836f0807311921ea2fac · Verlässliche Systemsoftware / projects / osv

Jun 09, 2013

sched: add tracepoint for preemption events · 76dae193
Avi Kivity authored 11 years ago

76dae193

lock-free mutex: add C API for lockfree::mutex methods · d3c156d8

Nadav Har'El authored 11 years ago

Add some extern "C" versions of the lockfree::mutex methods. They will be
necessary for providing the lockfree::mutex type to C code - as you'll see
in later patches, C code will see an opaque type, a byte array, and will
call these functions to operate on it.
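
A minimal sketch of the opaque-type idea, not the actual patch: `sketch_mutex` is a trivial stand-in for lockfree::mutex, and the `c_mutex_*` names are made up for illustration. C code would only ever see the byte array and the extern "C" functions.

```cpp
#include <atomic>
#include <new>

// Stand-in for lockfree::mutex (assumption: a trivial spinlock, not the real algorithm).
class sketch_mutex {
    std::atomic<bool> locked{false};
public:
    void lock() { while (locked.exchange(true, std::memory_order_acquire)) {} }
    bool try_lock() { return !locked.exchange(true, std::memory_order_acquire); }
    void unlock() { locked.store(false, std::memory_order_release); }
};

extern "C" {
// What C code would see: an opaque byte array of the right size and alignment.
typedef struct {
    alignas(sketch_mutex) unsigned char bytes[sizeof(sketch_mutex)];
} c_mutex_t;  // hypothetical name

// Construct the C++ object inside the caller-provided storage.
void c_mutex_init(c_mutex_t* m) { new (m->bytes) sketch_mutex(); }

void c_mutex_lock(c_mutex_t* m) {
    reinterpret_cast<sketch_mutex*>(m->bytes)->lock();
}

// Returns 1 if the lock was taken, 0 otherwise.
int c_mutex_try_lock(c_mutex_t* m) {
    return reinterpret_cast<sketch_mutex*>(m->bytes)->try_lock();
}

void c_mutex_unlock(c_mutex_t* m) {
    reinterpret_cast<sketch_mutex*>(m->bytes)->unlock();
}
}
```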

d3c156d8

lock-free mutex: Avoid including <sched.hh> in <lockfree/mutex.hh> · 874b10ee

Nadav Har'El authored 11 years ago

Do not include <sched.hh> in <lockfree/mutex.hh>.

Including <sched.hh> creates annoying dependency loops when we (in a
later patch) replace <osv/mutex.h> by <lockfree/mutex.hh>, and some header
files included by <sched.hh> themselves use mutexes, so they include
<osv/mutex.h>. This last include does nothing (because of the include guard)
but the compiler never finished reading osv/mutex.h (it was only in its
beginning, when it included sched.hh) so the inner-included code lacks the
definitions it assumes after including mutex.h.

874b10ee

lockfree mutex: add owned() and getdepth() methods · 1ed9c982

Nadav Har'El authored 11 years ago

Add to lockfree::mutex the simple owned() and getdepth() methods which
existed in ::mutex and were used in a few places - so we need these
methods to switch from ::mutex to lockfree::mutex.

1ed9c982

lockfree mutex: fix wait/wake bug · 9bcf790a

Nadav Har'El authored 11 years ago

When I developed lockfree mutex, the scheduler, preemption, and related code
still had a bunch of bugs, so I resorted to some workarounds that in hindsight
look unnecessary, and even wrong.

When it seemed that I could only wake() a thread in wait state, I made an
effort to enter the waiting state (with "wait_guard") before adding the
thread to the to-awake queue, and then slept with schedule(). The biggest
problem with this approach was that a spurious wake(), for any reason, of
this thread, would cause us to end the lock() - and fail on an assert that
we're the owners of the lock - instead of repeating the wait. When switching
to lockfree mutex, the sunflow benchmark (for example) would die on this
assertion failure.

So now I replaced this ugliness with our familiar idiom, wait_until().
The thread is in running state for some time after entering the queue, so
it might be woken when not yet sleeping and the wake() will be ignored -
but this is fine because part of our protocol is that the wake() before
waking also sets "owner" to the to-be-woken thread, and before sleeping
we check if owner isn't already us.

Also changed the comment on "owner" to say that it is not *just* for
implementing a recursive mutex, but also necessary for the wakeup protocol.
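
The handoff protocol can be sketched with standard C++ primitives. This is an illustration under assumptions, not OSv code: `handoff`, `hand_to()` and `wait_for_ownership()` are hypothetical names, thread identities are plain ints, and a condition variable plays the role of wait_until()/wake(). The point is that the waker publishes the new owner before waking, and the waiter re-checks the owner both before sleeping and after every wakeup, so neither an early nor a spurious wake() is harmful.

```cpp
#include <atomic>
#include <condition_variable>
#include <mutex>

struct handoff {
    std::atomic<int> owner{0};     // 0 means "no owner"; other values are thread ids
    std::mutex m;                  // only guards the sleep/wake pair
    std::condition_variable cv;

    // Waker side: set "owner" to the to-be-woken thread first, then wake it.
    void hand_to(int tid) {
        owner.store(tid, std::memory_order_release);
        std::lock_guard<std::mutex> guard(m);
        cv.notify_all();
    }

    // Waiter side: the predicate is checked before sleeping and again after
    // each wakeup, so a wake that arrives early or spuriously just loops.
    void wait_for_ownership(int self) {
        std::unique_lock<std::mutex> guard(m);
        cv.wait(guard, [&] {
            return owner.load(std::memory_order_acquire) == self;
        });
    }
};
```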

9bcf790a

Implement shutdown() on unix domain sockets · 30f6e9dd

Nadav Har'El authored 11 years ago

The existing shutdown() code only worked with AF_INET sockets, and returned
ENOTSOCK for AF_LOCAL sockets, because we implemented the latter sockets in
completely different code (in af_local.cc).

So in uipc_syscalls_wrap.c, the same place we call the special af-local
socketpair(), we also need to call the special af-local shutdown().

The way we do it is a bit ugly, but effective: shutdown() first calls
shutdown_af_local(), and if that returns ENOTSOCK (so it's not an af_local
socket), we continue trying the regular socket shutdown code.

A better way would have been to add shutdown() to the fileops table -
either the generic one (why not?), or invent a new mechanism whereby
certain file types (in this case, "sockets" of all types) can have additional
ops tables including in this case a shutdown() operation. Linux has
something of this sort for implementing shutdown().
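
The fall-through described above looks roughly like the following sketch. The two backend functions here are hypothetical stand-ins (they classify descriptors by number purely for illustration); only the control flow in `shutdown_any()` reflects what the message describes.

```cpp
#include <cerrno>

// Hypothetical stand-ins for the two socket implementations. In this sketch,
// descriptors 100 and up are af_local sockets and 10..99 are AF_INET sockets.
int shutdown_af_local(int fd, int how) {
    (void)how;
    return fd >= 100 ? 0 : ENOTSOCK;
}

int shutdown_inet(int fd, int how) {
    (void)how;
    return (fd >= 10 && fd < 100) ? 0 : ENOTSOCK;
}

// Try the af_local path first; ENOTSOCK means "not one of mine", so fall
// through to the regular socket code. Any other result is final.
int shutdown_any(int fd, int how) {
    int error = shutdown_af_local(fd, how);
    if (error != ENOTSOCK) {
        return error;
    }
    return shutdown_inet(fd, how);
}
```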

30f6e9dd

Jun 06, 2013

msix: provide high priority handler when registering interrupt · 66066b07

Guy Zana authored 11 years ago

we have to disable virtio interrupts before msix EOI so disabling
must be done in the ISR handler context. This patch adds an std::function
isr to the bindings.

references to the rx and tx queues are saved as well (_rx_queue and _tx_queue),
so they can be used in the ISR context.

this patch reduces virtio net rx interrupts by a factor of 450.
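
A rough sketch of the idea, with every name invented for illustration (`fake_queue`, `msix_binding`, `deliver()` are not the OSv types): the binding carries an extra std::function that runs in interrupt context, before the EOI, and uses it to turn off further queue interrupts until the deferred work has drained the queue.

```cpp
#include <functional>

// Hypothetical stand-in for a virtio queue's interrupt switch.
struct fake_queue {
    bool interrupts_enabled = true;
    void disable_interrupts() { interrupts_enabled = false; }
    void enable_interrupts() { interrupts_enabled = true; }
};

// Hypothetical binding: a high-priority handler plus the deferred wakeup.
struct msix_binding {
    std::function<void()> isr;   // runs in interrupt context, before EOI
    std::function<void()> wake;  // deferred work, e.g. waking the rx thread
};

// Deliver one interrupt: high-priority handler first, then the wakeup.
void deliver(const msix_binding& b) {
    if (b.isr) {
        b.isr();
    }
    // The EOI would be signalled at this point.
    if (b.wake) {
        b.wake();
    }
}
```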

66066b07

virtio: expose disable/enable interrupts · 8efa9c02
Guy Zana authored 11 years ago

8efa9c02
virtio: the current queue was already selected · 9e1bd70b
Guy Zana authored 11 years ago

9e1bd70b
virtio: check errors after setting msix entry to queue · f77156cc
Guy Zana authored 11 years ago

f77156cc
virtio: the msix vector and queue select is a word len register · 016f928f
Guy Zana authored 11 years ago

016f928f

New include file, <osv/lazy_indirect.hh>, which implements a template · b2efa58e

Nadav Har'El authored 11 years ago

lazy_indirect<T> which only allocates T, on the heap, on first use. The
lazy_indirect<T> object itself only takes 8 bytes of memory (a single pointer).

This template is useful for implementing pthread_mutex_t and pthread_cond_t
using the larger lockfree::mutex and condvar (containing a lockfree::mutex)
objects.
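
One possible shape for such a template, as a sketch only: the real <osv/lazy_indirect.hh> may differ, and `fat_state` is a made-up stand-in for a larger object such as lockfree::mutex. The object holds a single pointer and allocates T on first access; if two threads race, the loser discards its copy.

```cpp
#include <atomic>

template <typename T>
class lazy_indirect {
    std::atomic<T*> ptr{nullptr};
public:
    lazy_indirect() = default;
    lazy_indirect(const lazy_indirect&) = delete;
    lazy_indirect& operator=(const lazy_indirect&) = delete;
    ~lazy_indirect() { delete ptr.load(std::memory_order_relaxed); }

    // Return the T, allocating it on the heap on first use.
    T* get() {
        T* p = ptr.load(std::memory_order_acquire);
        if (p) {
            return p;
        }
        T* fresh = new T();
        // On failure, p is updated to the pointer another thread installed.
        if (ptr.compare_exchange_strong(p, fresh, std::memory_order_acq_rel,
                                        std::memory_order_acquire)) {
            return fresh;
        }
        delete fresh;
        return p;
    }
    T* operator->() { return get(); }
    T& operator*() { return *get(); }
};

// Hypothetical "large" payload, standing in for a bigger mutex object.
struct fat_state {
    char padding[40];
    int depth = 0;
};
```

The wrapper stays pointer-sized no matter how large T is, which is what lets it fit inside a fixed-size type.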

b2efa58e

Merge branch 'tracepoint' · 158d5681
Avi Kivity authored 11 years ago
```
Optimize tracepoints fast path to a single nop instruction.
```
158d5681
run: use same networking setup as libvirt's default · a4480853
Avi Kivity authored 11 years ago
```
Allows networking to work without reconfiguration or dhcp.
```
a4480853

trace: relax unique ID requirement · aec5d9df

Avi Kivity authored 11 years ago

Place the tracepointv type in an anonymous namespace.  This makes every
translation unit have its own unique tracepoint types, so we only need
to ensure uniqueness within a source file.

Use the type's type_info to select the correct patch sites.

Idea from Nadav.

aec5d9df

Jun 05, 2013

trace: improve fast path · b03979d9

Avi Kivity authored 11 years ago

When a tracepoint is disabled, we want it to have no impact on running code.

This patch changes the fast path to be a single 5-byte nop instruction. When
a tracepoint is enabled, the nop is patched to a jump instruction to the
out-of-line slow path.
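
To make the patching concrete, here is a sketch that only computes the instruction bytes and does not modify running code. It assumes x86-64, the common 5-byte NOP encoding, and a near `jmp rel32`; `insn5`, `make_jmp()` and `is_enabled()` are illustrative names, not part of the patch.

```cpp
#include <array>
#include <cstdint>

using insn5 = std::array<std::uint8_t, 5>;

// A common 5-byte x86 NOP: nopl 0x0(%rax,%rax,1).
constexpr insn5 nop5 = {{0x0f, 0x1f, 0x44, 0x00, 0x00}};

// Encode "jmp rel32" placed at address `site` and landing on `target`.
// The displacement is relative to the end of the 5-byte instruction.
insn5 make_jmp(std::uint64_t site, std::uint64_t target) {
    std::uint32_t rel = static_cast<std::uint32_t>(target - (site + 5));
    insn5 out = {{0xe9, 0, 0, 0, 0}};
    for (int i = 0; i < 4; ++i) {
        out[1 + i] = static_cast<std::uint8_t>(rel >> (8 * i));  // little-endian
    }
    return out;
}

// A patch site is "enabled" once it starts with the jmp opcode.
bool is_enabled(const insn5& site) { return site[0] == 0xe9; }
```

Because both encodings are exactly five bytes, enabling or disabling a tracepoint never changes the layout of the surrounding code.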

b03979d9

trace: add unique ID for tracepoints · 0102df29

Avi Kivity authored 11 years ago

In order to optimize the fast path of tracepoints, we need to patch
the call sites to skip calling the slow path code completely.  In turn,
that requires that each call site be unique -- a separate function.

In the current implementation, tracepoints with the same signature map
to the same type.  It would have been great to use the name as a discriminant
(tracepoint<"sched_queue", thread*> trace_sched_queue(...);), but C++ does
not support string literals as template arguments.

We could do

  const char* trace_sched_queue_name = "sched_queue";
  tracepoint<trace_sched_queue_name, thread*> trace_sched_queue(...);

but that doubles the code for declaring a tracepoint.  Add a unique ID instead
(and code to verify it is unique).
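
For illustration, a sketch of both options with made-up names (`named_tracepoint`, `id_tracepoint`, `thread_stub`). As far as I know, the name-based variant only compiles when the name is a named array object rather than a pointer variable, which is part of why it costs an extra declaration per tracepoint.

```cpp
#include <string>
#include <type_traits>

struct thread_stub {};  // stand-in for sched::thread

// Variant 1: the name is a template argument, so it must be a named object.
template <const char* Name, typename... Args>
struct named_tracepoint {
    static std::string name() { return Name; }
};

constexpr char trace_sched_queue_name[] = "sched_queue";
constexpr char trace_sched_switch_name[] = "sched_switch";
using sched_queue_tp = named_tracepoint<trace_sched_queue_name, thread_stub*>;
using sched_switch_tp = named_tracepoint<trace_sched_switch_name, thread_stub*>;

// Variant 2: a numeric ID as the discriminant, one declaration per tracepoint.
template <unsigned Id, typename... Args>
struct id_tracepoint {
    static constexpr unsigned id = Id;
};
using queue_by_id = id_tracepoint<1, thread_stub*>;
using switch_by_id = id_tracepoint<2, thread_stub*>;

// Same signature, different discriminant: distinct types in both variants.
static_assert(!std::is_same<sched_queue_tp, sched_switch_tp>::value,
              "names distinguish the types");
static_assert(!std::is_same<queue_by_id, switch_by_id>::value,
              "IDs distinguish the types");
```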

0102df29

lockfree::mutex functions should not be inline · cb41801a

Nadav Har'El authored 11 years ago

Until now, lockfree::mutex functions were entirely inline, which won't
fly if we want to make it our default mutex. Change them to be out-of-line,
implemented in a new source file core/lfmutex.cc.

This has a slight performance impact - uncontended lock/unlock pair used
to be 17ns, and is now 22ns.

In the future we can get this performance back by making these functions
partially inline (the uncontended case inline, the waiting case in a
separate function), although we'll need to consider the speed/code-size
tradeoff.
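
A sketch of what "partially inline" could look like, under the assumption of a much simpler lock than lockfree::mutex: the uncontended attempt lives in the class body so it can be inlined at call sites, while the waiting loop is defined out of line. `split_mutex` and `lock_slow()` are illustrative names.

```cpp
#include <atomic>
#include <thread>

class split_mutex {
    std::atomic<bool> locked{false};
    void lock_slow();  // contended path, defined out of line below
public:
    bool try_lock() { return !locked.exchange(true); }

    // Fast path: one atomic exchange, small enough to inline everywhere.
    void lock() {
        if (!try_lock()) {
            lock_slow();
        }
    }
    void unlock() { locked.store(false); }
};

// Kept out of line so call sites stay small. A real mutex would queue the
// thread and sleep; this sketch just yields and retries.
void split_mutex::lock_slow() {
    while (!try_lock()) {
        std::this_thread::yield();
    }
}
```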

cb41801a

Loader: show demangled symbol name on lookup failure · b39289cf

Nadav Har'El authored 11 years ago

When abort()ing on failed symbol lookup, if this is a C++ symbol, also
show its demangled form, which in many cases can be more helpful.
Here is an example lookup failure now:

failed looking up symbol _ZN8lockfree5mutex4lockEv (lockfree::mutex::lock())
Aborted
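
A message like this can be produced with the Itanium C++ ABI's `abi::__cxa_demangle()`. The helper names below are illustrative and the loader's actual code may differ. The `_Z` prefix check is there because `__cxa_demangle()` also accepts bare type encodings, which would mangle the output for plain C symbols.

```cpp
#include <cxxabi.h>
#include <cstdlib>
#include <string>

// Return the demangled form of a C++ symbol, or the input unchanged.
std::string demangle(const std::string& symbol) {
    if (symbol.compare(0, 2, "_Z") != 0) {
        return symbol;  // not a mangled C++ name
    }
    int status = 0;
    char* text = abi::__cxa_demangle(symbol.c_str(), nullptr, nullptr, &status);
    if (status != 0 || text == nullptr) {
        return symbol;
    }
    std::string out(text);
    std::free(text);  // the buffer is malloc()ed by __cxa_demangle
    return out;
}

// Build the lookup-failure line, adding the demangled form when it differs.
std::string lookup_failure_message(const std::string& symbol) {
    std::string msg = "failed looking up symbol " + symbol;
    std::string pretty = demangle(symbol);
    if (pretty != symbol) {
        msg += " (" + pretty + ")";
    }
    return msg;
}
```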

b39289cf

Speed up lockfree::mutex uncontended case · 423f109b

Nadav Har'El authored 11 years ago

Significantly speed up lockfree::mutex's uncontended case, by avoiding
sequentially consistent memory ordering in atomic operations. I *think* I did this
correctly, but can't be really sure ;-) Moreover, I didn't change the
memory ordering on the rarer cases, and these should also be reduced in
the future.

Uncontended lock&unlock of lockfree::mutex is now 17ns. This is faster
than the previous mutex (24ns) but slower than spinlock (10ns).
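
As an illustration of the kind of change involved (not the actual lockfree::mutex code), here is a try-lock on a counter that uses acquire/release ordering instead of the default sequentially consistent ordering. `ordered_lock` is a made-up name.

```cpp
#include <atomic>

class ordered_lock {
    std::atomic<int> count{0};  // 0 = free, 1 = held
public:
    bool try_lock() {
        int expected = 0;
        // acquire on success: the critical section cannot move above the lock.
        // relaxed on failure: nothing was acquired, so no ordering is needed.
        return count.compare_exchange_strong(expected, 1,
                                             std::memory_order_acquire,
                                             std::memory_order_relaxed);
    }

    void unlock() {
        // release: writes made while holding the lock become visible to
        // whichever thread acquires it next.
        count.store(0, std::memory_order_release);
    }
};
```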

423f109b

Jun 04, 2013

Fix argv handling in RunJava · b4d67a4a

Nadav Har'El authored 11 years ago

The recent change, to add the program name as argv[0] for C code's
main(), makes sense for C code, but less for Java code, where main()
normally expects args[0] to be the first argument, not the program name.

So the change to RunJava.java was un-Java-like; it also broke the "java"
CLI command which didn't put "java" in argv[0] for the arguments to
RunJava.main(), so the "java" command no longer worked after the previous
patch.

Instead, we change java.cc (which compiles to java.so). This is what
calls RunJava.class, and it should remove the new argv[0] before calling its
main() - instead of expecting RunJava.class to do this.

b4d67a4a

netperf: add instruction on how to build netperf for osv · 32f99535
Guy Zana authored 11 years ago

32f99535
fix logging in select · b87d2d26
Guy Zana authored 11 years ago

b87d2d26

loader: don't consume one element of argv before running main() · 1e7452c8

Guy Zana authored 11 years ago

the convention in Linux is that argv[0] holds the program name.
I tried to run netserver outside the CLI and it didn't work because
its argument parsing was broken.

1e7452c8

CLI: Allow running a single command non-interactively · 496d27f8

Nadav Har'El authored 11 years ago

Added the possibility to pass to cli.jar a command, which it runs instead
of taking commands interactively. Note that the initialization script is
run before the given command.

After this patch,

        scripts/run.py -e "java.so -jar /java/cli.jar"

Continues to run the interactive command line editor loop, as before.
But additionally, one can do:

        scripts/run.py -e "java.so -jar /java/cli.jar ls"

To run just the command "ls" and exit - exactly as if the user would type
this command on the command line and exit the VM.

The given command can be, of course, much longer. For example to run Jetty
after the CLI's normal initialization script, the following monster can
be used:

scripts/run.py -n -e "java.so -jar /java/cli.jar java -classpath /jetty/* org.eclipse.jetty.xml.XmlConfiguration /jetty/jetty.xml"

(Funny how a single command should say "java" 3 times and "jetty" 4 times :-))

496d27f8

CLI: Add "java" command · 01cb7973

Nadav Har'El authored 11 years ago

Add a "java" command to the CLI, using the same syntax of java.so and
attempting to emulate as closely as possible the "java" command on Linux.
So for example one can run

        java Hello

to run /java/Hello.class (/java is on the classpath by default), or

        java -jar /java/bench.jar

to run the main class of this jar, or a more sophisticated
command line, such as the following which runs Jetty (if the
appropriate files are in your image):

        java -classpath /jetty/* org.eclipse.jetty.xml.XmlConfiguration /jetty/jetty.xml

Note that like java.so, the new "java" command basically runs the RunJava
class (/java/RunJava.class). Remember that java.so adds /java to the parent
class loader, so we can always find the RunJava class even though it's not
in cli.jar or cloudius.jar.

01cb7973

Nonblocking pipes · 1fa558ef

Nadav Har'El authored 11 years ago

This patch adds support for O_NONBLOCK on pipes and unix domain sockets.

Java's EPollSelectorImpl uses a pipe to interrupt a sleeping poll, and it,
quite understandably, sets them to non-blocking (if you only write a
single byte to a pipe, you don't expect any blocking anyway).
So we can't croak if this option is used, and better just implement it
correctly.
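
From the application's side, the behavior being implemented is the standard POSIX one. A small sketch, using only fcntl(), read() and errno (the helper names are illustrative): a read on an empty non-blocking pipe fails with EAGAIN instead of sleeping.

```cpp
#include <cerrno>
#include <fcntl.h>
#include <unistd.h>

// Put a descriptor into non-blocking mode, preserving its other flags.
bool set_nonblocking(int fd) {
    int flags = fcntl(fd, F_GETFL);
    return flags != -1 && fcntl(fd, F_SETFL, flags | O_NONBLOCK) != -1;
}

// Read one byte. Returns 1 on data, 0 on end of file, or a negated errno
// value (for example -EAGAIN when the pipe is empty and non-blocking).
ssize_t try_read_byte(int fd, char* out) {
    ssize_t n = read(fd, out, 1);
    return n >= 0 ? n : -errno;
}
```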

1fa558ef

Jun 03, 2013

route.js: added a call to route with no arguments · 02a14a49
Guy Zana authored 11 years ago

02a14a49
ifconfig.js: added a call to ifconfig with no arguments · ee8cd808
Guy Zana authored 11 years ago

ee8cd808
run.js: ditch 'run' from arguments in invoke() instead of run() · 4a4dfc1d
Guy Zana authored 11 years ago
```
run_cmd.run() is an internal function that can be used by the rest of the cli.
```
4a4dfc1d
run.js: handle absolute paths · f3ce61ba
Guy Zana authored 11 years ago

f3ce61ba
tools: moved tst-lsroute to the tools directory · 21d94018
Guy Zana authored 11 years ago

21d94018
tools: moved tst-ifconfig to a tools directory · 49643339
Guy Zana authored 11 years ago

49643339

tests: remove non-useful / obsolete tests · 68badb50

Guy Zana authored 11 years ago

these tests are a bit outdated, they change the system configuration and are
not useful anymore, they were basically written to understand how stuff works.

tst-bsd-netdriver.c - was made just to figure out the network driver model of
  freebsd.
tst-bsd-netisr.c - same for isr layer, this test runs over the ARP isr and
  the system is badly wounded after it runs, it is useless today
  and was written to figure out how netisr works.
tst-virtionet.c - testing network interface creation using virtio,
  today the interface is created anyway.

68badb50

java.so: wait for other threads to finish · 5384f24f

Nadav Har'El authored 11 years ago

java.cc would exit right after the main() method finished. But in Java,
this is not the correct behavior. Rather, even if main() returns, we
need to wait for all other threads to end (or more accurately, wait
for all threads not marked with setDaemon(true)).

Calling jvm->DestroyJavaVM() does this for us, and it's probably the
Right Thing(TM) to do anyway.

Before this patch, the Jetty benchmark exited immediately after
startup. After this patch, its worker threads keep the whole VM running.

5384f24f

Implement __libc_stack_end · 60655973

Nadav Har'El authored 11 years ago

Java doesn't trust pthread_getattr_np() to work for the main thread in
Linux, so instead it relies on a global variable __libc_stack_end, which
we didn't implement, which caused an annoying warning message.

This patch implements __libc_stack_end. Paradoxically, this shouldn't point
to the real stack end (we can easily find this using our sched::thread interfaces)
but a little below it, because Java expects the address to be in the
stack, not one byte above it. So we use __builtin_frame_address(0) to
find the current frame's address.

Unfortunately, while this eliminates one warning message, another one
remains - because Java later expects to read /proc/self/maps and doesn't
find it.

60655973

Jun 02, 2013

Split af_local.cc into four files · 729bdbd5

Nadav Har'El authored 11 years ago

The source file af_local.cc implemented both pipes and bi-directional
pipes (unix domain stream socketpair), using a common buffer implementation.

As suggested by Guy, split this file into four files:

pipe_buffer.cc and pipe_buffer.hh contain the common buffer implementation,
class pipe_buffer. Since this buffer basically implements a single-direction
pipe, I renamed it from "af_local_buffer" to pipe_buffer.

af_local.cc now contains just the unix domain stream socketpair
implementation, implemented using two pipe_buffer objects.

af_pipe.cc contains the Posix pipe() implementation, implemented using
one pipe_buffer object.

729bdbd5

Fix readv() and writev() support in pipe and unix-domain socket. · 09e61023

Nadav Har'El authored 11 years ago

The iovec iteration was broken, so both readv() and writev() on pipes
and unix-domain stream sockets didn't work. Fix it.
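
For reference, a correct iovec walk for the write side might look like the sketch below. `gather()` is an illustrative helper, not the function in the patch: it copies from each segment in turn and stops when either the segments or the destination capacity run out.

```cpp
#include <sys/uio.h>
#include <algorithm>
#include <cstddef>
#include <cstring>

// Gather up to `cap` bytes from an iovec array into `dst`.
// Returns the number of bytes copied.
std::size_t gather(const iovec* iov, int iovcnt, char* dst, std::size_t cap) {
    std::size_t done = 0;
    for (int i = 0; i < iovcnt && done < cap; ++i) {
        // Take the whole segment, or only what still fits in dst.
        std::size_t n = std::min<std::size_t>(iov[i].iov_len, cap - done);
        std::memcpy(dst + done, iov[i].iov_base, n);
        done += n;
    }
    return done;
}
```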

09e61023

Atomic writes, and long writes, to pipes. · 2a6a0391

Nadav Har'El authored 11 years ago

This patch fixes two behaviors of pipes and unix-domain stream socketpair,
which went against Posix and Linux standards:

1. A blocking write() on a pipe needs to return only when the full write
is finished. It should not just write until the end of the pipe buffer
and return - as we did in the previous code.

This means that a long write() to a pipe can write the data in parts,
waiting between them for a reader to read from the pipe.

2. As explained above, writes will be split into parts (and if there are
multiple writers, get mixed with writes from other writers). But Posix
also guarantees that short writes - up to PIPE_BUF bytes (4096 on
Linux) - are *atomic*, and will not be split up.
In the previous code, if even 1 byte was available on the buffer,
we wrote it. Now, if the write is short, we need to wait until the
entire needed length is available.
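
The two rules can be condensed into one decision function. This is a sketch with made-up names (`pipe_buf`, `writable_now()`), not the code in pipe_buffer.cc: a blocking writer asks how much it may copy right now, and a result of 0 means "sleep until a reader frees space, then ask again".

```cpp
#include <algorithm>
#include <cstddef>

constexpr std::size_t pipe_buf = 4096;  // PIPE_BUF on Linux

// total_len:  length of the whole write() request
// remaining:  bytes of that request not yet copied
// free_space: bytes currently free in the pipe buffer
std::size_t writable_now(std::size_t total_len, std::size_t remaining,
                         std::size_t free_space) {
    if (total_len <= pipe_buf) {
        // Short write: atomic, so it goes in as one piece or not at all.
        return free_space >= total_len ? total_len : 0;
    }
    // Long write: copy whatever fits and come back for the rest.
    return std::min(remaining, free_space);
}
```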

2a6a0391

More pipe tests · f4ba833c

Nadav Har'El authored 11 years ago

Test atomic writes, long writes (should block until complete), readv
and writev on pipes. All of these fail at this point, and will be fixed
by the following commits.

f4ba833c