- Jun 23, 2013
-
-
Nadav Har'El authored
-
- Jun 21, 2013
-
-
Guy Zana authored
Just as we can tell the host to disable interrupts via the _avail ring, the host can tell us to suppress notifications via the _used ring. Every notification, or kick, consumes about 10ns as it is implemented as a write to an I/O port, which travels to userspace qemu in the host. This simple patch increases netperf's throughput sixfold, from 300 Mbps to 1800 Mbps.
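The check described above can be sketched roughly as follows. This is only an illustration, not the code from the patch: `used_ring_header` and `should_kick()` are hypothetical names, and the only value taken from the virtio specification is the `VRING_USED_F_NO_NOTIFY` flag.

```cpp
#include <atomic>
#include <cstdint>

// Flag value from the virtio specification: the host sets it in the
// used ring's flags to ask the guest not to notify it about new buffers.
constexpr std::uint16_t VRING_USED_F_NO_NOTIFY = 1;

// Hypothetical, simplified view of the used-ring header shared with the host.
struct used_ring_header {
    std::atomic<std::uint16_t> flags;
    std::atomic<std::uint16_t> idx;
};

// Returns true if the guest should kick (notify) the host after
// publishing new buffers on the avail ring.
bool should_kick(const used_ring_header& used)
{
    // Order the earlier avail-ring update before the read of the flags.
    std::atomic_thread_fence(std::memory_order_seq_cst);
    std::uint16_t flags = used.flags.load(std::memory_order_relaxed);
    return (flags & VRING_USED_F_NO_NOTIFY) == 0;
}
```

A caller would publish its buffers first and write to the notification I/O port only when `should_kick()` returns true, which is where the saving comes from.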
-
Guy Zana authored
-
- Jun 20, 2013
-
-
Dor Laor authored
Indirect is good for a very large SG list but isn't required when there is enough room on the ring or the SG list is tiny. For the time being there is barely any use of it, so I set it off by default.
-
Dor Laor authored
The feature allows the hypervisor to batch several packets together as one large SG list. Once such a header is received, the guest rx routine iterates over the list and assembles a mega mbuf. The patch also simplifies the rx path by using a single buffer for the virtio data and its header. This shrinks the SG list from two entries to a single one. The issue is that at the moment I haven't seen packets w/ mbuf > 1 being received. A Linux guest does receive such packets here and there. It may be due to the use of offload features that enlarge the packet size.
-
- Jun 19, 2013
-
-
Guy Zana authored
This rwlock gives precedence to writers; it relies on a mutex and 2 condvars for its implementation. It also supports taking the lock recursively for both readers and writers. This implementation is not fully tested, yet the TCP stack uses it extensively, so far without any observed races (tested with TCPDownload and netperf).
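A minimal sketch of a writer-preferring rwlock built from one mutex and two condition variables is shown below. It is not the implementation from this commit: it uses the standard library instead of the kernel's own mutex and condvar, it omits recursive locking, and the class name `writer_pref_rwlock` and the `try_rlock()` helper are assumptions made for the example.

```cpp
#include <condition_variable>
#include <mutex>

class writer_pref_rwlock {
public:
    void rlock() {
        std::unique_lock<std::mutex> lk(_mtx);
        // Readers yield to any active writer and to any waiting writer.
        _readers_cv.wait(lk, [this] {
            return !_writer_active && _writers_waiting == 0;
        });
        ++_readers;
    }
    // Non-blocking variant, convenient for demonstrations and tests.
    bool try_rlock() {
        std::lock_guard<std::mutex> lk(_mtx);
        if (_writer_active || _writers_waiting > 0) {
            return false;
        }
        ++_readers;
        return true;
    }
    void runlock() {
        std::lock_guard<std::mutex> lk(_mtx);
        if (--_readers == 0) {
            _writers_cv.notify_one();  // last reader out lets a writer in
        }
    }
    void wlock() {
        std::unique_lock<std::mutex> lk(_mtx);
        ++_writers_waiting;
        _writers_cv.wait(lk, [this] {
            return !_writer_active && _readers == 0;
        });
        --_writers_waiting;
        _writer_active = true;
    }
    void wunlock() {
        std::lock_guard<std::mutex> lk(_mtx);
        _writer_active = false;
        if (_writers_waiting > 0) {
            _writers_cv.notify_one();  // writers go first
        } else {
            _readers_cv.notify_all();
        }
    }
private:
    std::mutex _mtx;
    std::condition_variable _readers_cv;
    std::condition_variable _writers_cv;
    int _readers = 0;
    int _writers_waiting = 0;
    bool _writer_active = false;
};
```

The `_writers_waiting` counter is what gives writers precedence: new readers block as soon as a writer is queued, even while other readers still hold the lock.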
-
Guy Zana authored
1. It is much cleaner for the header files to perform extern "C" themselves, so they can be included from both C and C++ code.
2. When wrapping an include in extern "C" from a C++ file, __cplusplus is still defined, and compilation can break in some situations.
3. As a bonus, this patch improves compilation time.
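The usual shape of a header that performs extern "C" itself is sketched below. The header name, the include guard and the function `bsd_example_add()` are hypothetical; the definition is included only so that the snippet compiles on its own.

```cpp
// bsd_example.h (hypothetical header): safe to include from C and C++.
#ifndef BSD_EXAMPLE_H
#define BSD_EXAMPLE_H

#ifdef __cplusplus
extern "C" {
#endif

/* Declared with C linkage regardless of who includes the header. */
int bsd_example_add(int a, int b);

#ifdef __cplusplus
}
#endif

#endif /* BSD_EXAMPLE_H */

// bsd_example.c or .cc (hypothetical): the matching definition.
// In a C++ translation unit it picks up C linkage from the declaration above.
int bsd_example_add(int a, int b)
{
    return a + b;
}
```

With this layout, C++ callers write a plain `#include` and never wrap it in their own extern "C" block.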
-
Nadav Har'El authored
netport.h defines a log() macro, which is an unfortunate choice of name because log is also a pretty well-known mathematical function. So rename this macro to bsd_log(), and change the dozen files which used log() to use bsd_log().
-
Nadav Har'El authored
Java's os::available() requires the FIONREAD ioctl on fds which do not implement seek. So we need to support this ioctl for the console.
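For reference, this is roughly how a caller uses FIONREAD on a non-seekable fd. It is a sketch for a Linux-like host, not code from the patch, and `bytes_available()` is a hypothetical helper name.

```cpp
#include <sys/ioctl.h>
#include <unistd.h>

// Returns the number of bytes that can be read from fd without blocking,
// or -1 if the fd does not support the FIONREAD ioctl.
int bytes_available(int fd)
{
    int n = 0;
    if (ioctl(fd, FIONREAD, &n) < 0) {
        return -1;
    }
    return n;
}
```

On a pipe, for example, the result grows by the number of bytes written to the other end and not yet read.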
-
Nadav Har'El authored
Use the new wake_with() in lock-free mutex
-
Nadav Har'El authored
Before commit 1b53ec56, condvar_wake_all() had a crash which could be seen in tst-pipe.so (which apparently tests some condvar code paths that weren't tested by tst-condvar.so). The fix in that commit was for condvar_wait() to regain the condvar internal lock after the wait, even if not needed (it's only really needed in case of timeout). This masked the bug (see details below) but also deteriorated performance: the woken-up thread now often goes back to sleep to wait for the lock, which is still held by condvar_wake().

This patch reverts that commit, i.e., condvar_wait() does not retake the lock when woken. Instead, we fix the real bug. The bug was in condvar_wake_all(), which did:

    sched::thread *t = wr->t;
    wr->t = nullptr;
    t->wake();
    wr = wr->newer;

but after the wake(), wr is no longer valid (the waiter, being woken, would quickly exit the condvar_wait() function which held wr on the stack).

However, by not taking the lock after the wait we also have another potential bug: in rare cases merely doing wr->t = nullptr can cause the thread t to start running, and if it not only stops waiting but also exits, the call to t->wake() will refer to an invalid thread and may crash. So we need to use the new wake_with() thread method introduced in a previous patch.
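The ordering problem in the loop can be illustrated with a single-threaded model. This is a sketch only: `fake_thread`, `wait_record` and `wake_all()` are hypothetical stand-ins, and the real fix additionally relies on wake_with() to keep the thread alive, which this model does not show.

```cpp
// Hypothetical stand-ins for sched::thread and the condvar wait record.
struct fake_thread {
    int wakeups = 0;
    void wake() { ++wakeups; }
};

struct wait_record {
    fake_thread* t;
    wait_record* newer;
};

// Wakes every waiter on the list, oldest first, and returns how many
// were woken. The link to the next record is read before the waiter is
// released, because the record lives on the waiter's stack and may
// vanish as soon as the waiter is allowed to run.
int wake_all(wait_record* oldest)
{
    int woken = 0;
    wait_record* wr = oldest;
    while (wr != nullptr) {
        wait_record* next = wr->newer;  // read before releasing the waiter
        fake_thread* t = wr->t;
        wr->t = nullptr;                // from here on, wr must not be touched
        t->wake();
        wr = next;
        ++woken;
    }
    return woken;
}
```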
-
Nadav Har'El authored
Added a test for wake_with(). It tries to ensure that the problematic case solved by wake_with() actually happens quickly, by:

1. Spin a long time between the setting of the flag and t->wake()
2. Do a spurious wake() to ensure that the waiting thread is woken up right after setting the flag, before the intended wake.
3. Use mprotect() to ensure that working with an already join()ed thread crashes immediately, instead of just maybe crashing.

This test fails when wake_with() doesn't use ref()/unref(), and succeeds with the full wake_with().

tst-wake contains a second test, which does the same thing but without the additional measures we used to show the bug (spinning, spurious wake and mprotect). Without these additional measures the test iteration is much faster, which allows us to stress wake/join much more.
-
Nadav Har'El authored
When we use wait_until(), e.g.,

    wait_until([&] { return *x == 0; })

we used (in a bunch of places in the code, including condvar) the following "obvious" idiom to wake it up:

    *x = 0;
    t->wake();

This does the right thing in *almost* all situations. But there's still one rare (but very possible) scenario where this is wrong. The problem is that the first line (*x = 0) may already cause the wait_until to return. This can happen when wait_until didn't yet check the condition, or if it was sleeping and by rare coincidence, got woken up by a spurious interrupt at the same time we did *x = 0.

Now, consider the case that the waiting thread decides to exit after the wait_until... So the "*x = 0" causes the thread to exit, and when we want to do "t->wake()" the thread no longer exists, and the statement crashes.

This patch adds two new thread methods: t->ref() increments a counter preventing a thread's destruction, until a matching t->unref(). With these methods, the correct way to wake the above wait_until() is:

    t->ref();
    *x = 0;
    t->wake();
    t->unref();

This patch also adds a one-line shortcut to the above 4 lines, with syntax mirroring that of wait_until:

    t->wake_with([&] { *x = 0; });

The ref()/unref() methods are marked private, to encourage the use of wake_with(), and also to allow wake_with() in the future to be optimized to avoid calling ref()/unref() when not needed. For example, when the thread is on the same CPU as the current thread, merely disabling preemption (a very fast operation) prevents the thread from running - and exiting - and ref()/unref() are not necessary.

Unfortunately, while this patch solves one bug, it does not solve two additional bugs that existed before, and continue to exist after this patch:

1. When a thread completes (see thread::complete()) it wakes a thread waiting on join() (if there is one) and this join() deletes the thread and its stack. The problem is that if the timing is right (or wrong ;-)), the joiner thread may delete the stack while complete() is still running on this stack, and can cause a crash.
2. If join() races with the thread's completion, it is possible that the thread thinks nobody is waiting for it so notifies nobody, but at the same time join() starts to wait, and will never be woken up.

Added two "FIXME" comments about these remaining bugs.
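The relationship between wake_with() and the private ref()/unref() pair can be modeled as below. This is a toy, single-threaded sketch and not the scheduler code: `toy_thread`, `can_destroy()` and `wakeups()` are hypothetical names used only to show that the reference is held for the whole of "set the condition, then wake".

```cpp
#include <atomic>

class toy_thread {
public:
    // Runs action (which makes the waiter's condition true) and then
    // wakes the thread, holding a reference across both steps so the
    // thread object cannot be destroyed in between.
    template <typename Action>
    void wake_with(Action action) {
        ref();
        action();
        wake();
        unref();
    }
    // In this model, destruction is allowed only with no references held.
    bool can_destroy() const { return _refs.load() == 0; }
    int wakeups() const { return _wakeups.load(); }
private:
    // Private, as in the commit message, to steer callers to wake_with().
    void ref() { _refs.fetch_add(1); }
    void unref() { _refs.fetch_sub(1); }
    void wake() { _wakeups.fetch_add(1); }

    std::atomic<int> _refs{0};
    std::atomic<int> _wakeups{0};
};
```

A caller mirrors the wait_until() syntax: `t.wake_with([&] { x = 0; });`.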
-
Nadav Har'El authored
lseek() crashes when used on pipes, sockets, and now also fd 0, 1 or 2 (the console), because they don't have an underlying vnode. No reason to assert() in this case, should just return ESPIPE (like Linux does for pipes, sockets and ttys). Similarly, fsync, readdir and friends, fchdir and fstatfs shouldn't crash if given a fd without a vnode, and rather should return the expected error.
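The expected behavior can be checked from user code with a small probe like the one below. It is a sketch for a POSIX host, and `lseek_errno_on_pipe()` is a hypothetical helper name.

```cpp
#include <cerrno>
#include <unistd.h>

// Calls lseek() on the read end of a fresh pipe and returns the errno it
// produced, 0 if lseek() unexpectedly succeeded, or -1 if pipe() failed.
int lseek_errno_on_pipe()
{
    int fds[2];
    if (pipe(fds) != 0) {
        return -1;
    }
    errno = 0;
    off_t r = lseek(fds[0], 0, SEEK_CUR);
    int err = (r == static_cast<off_t>(-1)) ? errno : 0;
    close(fds[0]);
    close(fds[1]);
    return err;
}
```

On Linux this returns ESPIPE, which is the result the patch aims for instead of an assertion failure.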
-
Nadav Har'El authored
We had a bug where a read() on the console (fd 0) would block writes to the console (fd 1 or 2). This was most noticeable when background threads in the CLI tried to write output, and were blocked until the next keypress because the blocking read() would lock the writes out.

The bug happens because we opened the console using open("/dev/console") and dup()'ed the resulting fd, but this results, in the current code, in every read and write to these file descriptors passing through vfs_read()/vfs_write(), which lock a single vnode lock for all three file descriptors - leading to a write on fd 1 blocking while a read is ongoing on fd 0.

This patch doesn't fix this vnode lock issue, which remains - and should be fixed when the devfs or vfs layers are rewritten. Instead, this patch adds a *second API* for opening a console which doesn't go through the vnode or devfs layers: a new console::open() function returns a file descriptor which implements the correct file operations, and is not associated with any vnode.

The new implementation works well with write() while read() is ongoing. Note that poll() support was missing from the old implementation (it seems it can't be done with the vnode abstraction?) and is still missing in the new implementation, although now it shouldn't be hard to add (need to implement the poll fileops, and to use poll_wake() in the line-discipline function console_poll).
-
Avi Kivity authored
tab completion relies on a global 'ls' object, re-add it. Broken by 4bfe157b.
-
Nadav Har'El authored
Sorry, missing unsupported_poll broke compilation after the previous patch
-
Nadav Har'El authored
This is an epoll_*() implementation which calls poll() to do the real work. This is of course a terrible implementation, which makes epoll() less efficient, instead of more efficient, than poll(). However, it allows me to progress with running Jetty in parallel with perfecting epoll.
-
Nadav Har'El authored
It's not clear if our DNS resolver works or not - need to test and fix if needed.
-
Nadav Har'El authored
Trivially implement mtx_assert(). This would catch the "ifconfig" bug fixed in the previous patch - where ifconfig called sofree() without the accept lock.
-
Nadav Har'El authored
ifconfig used to call sofree(), which assumed accept_mtx was taken, which wasn't true, resulting in either an assertion failure (if we implement mtx_assert - see next patch) or a mutex corruption (if mtx_assert does nothing). Instead, we should call soclose(). This wasn't very hard to figure out, given the comment in socreate() saying "The socket should be closed with soclose()." :-)
-
- Jun 18, 2013
-
-
Nadav Har'El authored
This patch turns on the flag which switches all our code to use the lock-free mutex instead of the spinlock-based mutex. It's time we start using the lock-free mutex, which is stable enough by now - but please let me know if you do experience any performance problem, or bugs, related to the new mutex. If you need to disable the new mutex temporarily and return to the old, just change the "#define LOCKFREE_MUTEX" in osv/mutex.h to #undef.
-
Nadav Har'El authored
Returning a void does nothing, and is just confusing.
-
Avi Kivity authored
Eclipse recognizes .mk as a makefile; this makes it easier for new users to use Eclipse.
-
Christoph Hellwig authored
-
Nadav Har'El authored
This single Java source file is a full-fledged HTTP 0.9 server. I wanted to add it to expose the console lock bug (fixed in a separate patch), and to verify that bind() works correctly (it does). But additionally, this tiny HTTP server (about 6KB of compressed bytecode) can be very useful for our CLI - it can be run in the background and let you view files in the OSV system in your browser, even while another program is running.

To run Shrew from the CLI, just run

    java com.cloudius.cli.util.Shrew

which runs the HTTP server in the background (in a separate thread), letting the user continue to use the CLI. If you add an argument "fg" to this command, it runs the server in the current thread, never returning.

Currently, the HTTP server is written to browse OSV's root directory hierarchy: accessing http://192.168.122.100:8080/ from the host shows you the OSV guest's root directory, and you can descend into more directories and download individual files.
-
Avi Kivity authored
Instead of defining a command object in one file and registering it in another, do everything in one place.
-
Avi Kivity authored
- per-cpu variables
- per-cpu kvmclock
- tracepoint probe functions
- tracepoint Java API
- 'perf stat' cli command
-
Avi Kivity authored
Usage:

    perf list          (lists all tracepoints)
    perf stat tp...    (counts tracepoints)

Example:

    [/]$ perf stat mutex_lock ctxsw=sched_switch mutex_unlock wake=sched_wake
      mutex_lock        ctxsw mutex_unlock         wake
              40            3         1909            2
            2075          147          190           82
             193          138          193           78
             146          139          146           92
             317          179          317           78
             146          139          146           78
             146          139          186           78
             205          139          165           78
             146          139          146           78
             146          139          146           78
             146          139          146           80
             193          143          193           81
             151          147          151           78
             146          139          146           78
             146          139          146           78
             146          139          146           78
             159          139          159           78
             149          139          149           78
             146          139          146           78
             164          139          164           78
             146          139          176           78
             176          139          146           78
             149          139          149           78
             146          139          146           78
             146          139          146           78
      mutex_lock        ctxsw mutex_unlock         wake
             146          139          146           79
             715          147          715           80
             188          139          204           78
-
Christoph Hellwig authored
-
Christoph Hellwig authored
-
Christoph Hellwig authored
-
Christoph Hellwig authored
Don't actually implement it either yet, but at least don't abort.
-
Christoph Hellwig authored
-
Christoph Hellwig authored
-
Christoph Hellwig authored
-
Christoph Hellwig authored
We already get these from our API version of <sys/mount.h>
-
Christoph Hellwig authored
-
Christoph Hellwig authored
-
Christoph Hellwig authored
-