- Jun 30, 2013
-
-
Nadav Har'El authored
Implement getpwnam(), setuid() and setgid() in the simplest way possible, considering that we don't support any userid except 0: getpwnam() returns user 0 for any username given to it. setuid() and setgid() do nothing for uid or gid 0, and fail otherwise. Where would the caller get this !=0 id anyway? Memcached needs these calls, because it wants to be clever and warn the user against running it as root....
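A minimal sketch of the "only user 0 exists" behavior described above. The sketch_* names, the "root" account name and the EPERM error code are assumptions made for illustration; the commit message does not say which errno the failing calls set.

```cpp
#include <cassert>
#include <cerrno>
#include <pwd.h>
#include <sys/types.h>

// Hypothetical sketch_* names, so nothing here clashes with the real libc.
// Assumption: the only account is uid 0 / gid 0, reported as "root".
struct passwd *sketch_getpwnam(const char *name)
{
    assert(name != nullptr);
    static char pw_name[] = "root";
    static char pw_dir[] = "/";
    static struct passwd root_entry = {};
    root_entry.pw_name = pw_name;
    root_entry.pw_dir = pw_dir;
    root_entry.pw_uid = 0;
    root_entry.pw_gid = 0;
    return &root_entry; // the same entry for every username
}

// Succeeds only for id 0. EPERM is an assumed error code.
int sketch_setuid(uid_t uid)
{
    if (uid == 0) {
        return 0; // already user 0, nothing to do
    }
    errno = EPERM;
    return -1;
}

int sketch_setgid(gid_t gid)
{
    if (gid == 0) {
        return 0; // already group 0, nothing to do
    }
    errno = EPERM;
    return -1;
}
```

A caller that looks up any username and then calls sketch_setuid() with the returned pw_uid always succeeds, which is all an application like memcached needs here.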
-
Nadav Har'El authored
Implement the signal() function. This is hardly a useful function in OSv, first because our signal support is pretty broken, and second because sigaction() is a much more portable API that should always be preferred. Nevertheless, memcached uses signal() (to catch SIGINT, which it will never get in OSv...), so let's implement it for the sake of completeness.
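One common way to provide signal() is as a thin wrapper over sigaction(). The sketch below shows that approach; it is not necessarily how this commit implements it. The sketch_* names are hypothetical, and SA_RESTART (the BSD/glibc flavor of signal() semantics) is an assumed choice.

```cpp
#include <cassert>
#include <signal.h>

using sketch_handler_t = void (*)(int);

// Last signal seen by the demo handler below.
static volatile sig_atomic_t sketch_last_signal = 0;

void sketch_record_signal(int sig)
{
    sketch_last_signal = sig;
}

// signal() expressed through sigaction(): install `handler` and
// return the previous handler, or SIG_ERR on failure.
sketch_handler_t sketch_signal(int signum, sketch_handler_t handler)
{
    assert(signum > 0);
    struct sigaction act = {};
    struct sigaction old = {};
    act.sa_handler = handler;
    sigemptyset(&act.sa_mask);
    act.sa_flags = SA_RESTART; // assumption: BSD-style restartable calls
    if (sigaction(signum, &act, &old) < 0) {
        return SIG_ERR;
    }
    return old.sa_handler;
}
```

Installing sketch_record_signal for SIGUSR1 and then calling raise(SIGUSR1) sets sketch_last_signal to SIGUSR1 on a system with working signal delivery.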
-
- Jun 18, 2013
-
-
Avi Kivity authored
Eclipse recognizes .mk as a makefile, which makes it easier for new users to use Eclipse.
-
- Jun 13, 2013
-
-
Nadav Har'El authored
usleep() was scrubbed out of POSIX in 2008, and is not used in Java, but it does exist in glibc and is damn easy to use compared to its newer relative, nanosleep(), so I want to use it in a test.
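For comparison, this is roughly what usleep() looks like when written on top of nanosleep(). The name sketch_usleep is hypothetical, and the sketch simply passes through nanosleep()'s return value (so an interrupted sleep returns -1 with errno set to EINTR).

```cpp
#include <cassert>
#include <time.h>

// Sleep for `usec` microseconds using nanosleep().
// Returns 0 on success, -1 on error (errno set by nanosleep()).
int sketch_usleep(unsigned long usec)
{
    struct timespec req;
    req.tv_sec = static_cast<time_t>(usec / 1000000UL);
    req.tv_nsec = static_cast<long>(usec % 1000000UL) * 1000L;
    // nanosleep() rejects nanosecond values outside [0, 1e9).
    assert(req.tv_nsec >= 0 && req.tv_nsec < 1000000000L);
    return nanosleep(&req, nullptr);
}
```

The one-line call sketch_usleep(1500) replaces the three lines of timespec setup that a direct nanosleep() call needs, which is the convenience the commit message is after.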
-
Nadav Har'El authored
As Avi pointed out, shutdown_af_local() did read-modify-write to f->f_flags without locking. Add the missing locks.
-
- Jun 12, 2013
-
-
Nadav Har'El authored
This patch optionally enables, at compile time, OSv to use the lock-free mutex instead of the spin-lock-based mutex. To use the lock-free mutex, change the line "#undef LOCKFREE_MUTEX" in include/osv/mutex.h to "#define LOCKFREE_MUTEX".

LOCKFREE_MUTEX is currently disabled by default, awaiting a few more tests, but at this point I'm happy to say that beyond one known unrelated bug (see details below), the lock-free mutex seems fairly stable, and survives all tests and benchmarks I threw at it.

The remaining known bug involves a thread destruction race between complete() and join(): complete() wake()s the joiner thread, which in rare cases can very quickly delete the thread's stack before wake() returns, causing a crash on return from wake(). This bug is really unrelated to the lock-free mutex, but for some unknown reason I can only reproduce it with the lock-free mutex on the SPECjvm2008 "sunflow" benchmark.

To make lockfree::mutex our default mutex, this patch does the following when LOCKFREE_MUTEX is defined:
1. In core/mutex.cc, #ifndef out the old mutex code, leaving the spinlock code in case someone wants to use it directly.
2. In include/osv/mutex.h, do different things in C++ and C (remember that lockfree::mutex is a C++ class, and cannot be used directly from C):
   * In C++, simply make mutex and mutex_t aliases for lockfree::mutex.
   * In C, make struct mutex and mutex_t an opaque 40-byte structure (in C++ compilation, we verify that this 40 is indeed the C++ class's size), and define the operations on it.
3. In libc/pthread.cc, if LOCKFREE_MUTEX is defined, the new mutex unfortunately will not fit into pthread_mutex_t, and neither will condvar fit into pthread_cond_t. So use a lazily allocated mutex or condvar, using the lazy_indirect<> template.
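The lazy allocation mentioned in step 3 can be pictured as a pointer-sized slot that allocates the real object on first use. The sketch below is a hypothetical stand-in written for illustration, not OSv's lazy_indirect<> template; sketch_big_mutex is a made-up 40-byte-plus placeholder for an object too large to embed.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for a lazily allocated object behind a pointer.
// A zero-initialized slot is valid: the object is created on first get().
template <typename T>
class sketch_lazy_indirect {
public:
    ~sketch_lazy_indirect() { delete ptr_.load(std::memory_order_acquire); }

    T *get()
    {
        T *p = ptr_.load(std::memory_order_acquire);
        if (p != nullptr) {
            return p; // fast path: already allocated
        }
        T *fresh = new T();
        if (ptr_.compare_exchange_strong(p, fresh,
                                         std::memory_order_acq_rel,
                                         std::memory_order_acquire)) {
            return fresh; // we installed our object
        }
        // Another thread won the race; the failed exchange loaded its pointer.
        assert(p != nullptr);
        delete fresh;
        return p;
    }

private:
    std::atomic<T *> ptr_{nullptr};
};

// Made-up placeholder for an object that does not fit in the user's struct.
struct sketch_big_mutex {
    char state[40] = {};
    int lock_count = 0;
};

// On mainstream platforms the slot is exactly one pointer wide.
static_assert(sizeof(sketch_lazy_indirect<sketch_big_mutex>) == sizeof(void *),
              "slot should be pointer-sized");
```

The design trade-off is one extra pointer dereference per operation in exchange for freedom to grow the real object beyond the size of the user-visible structure.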
-
- Jun 10, 2013
-
-
Avi Kivity authored
If the CPU supports "Enhanced REP MOVSB / STOSB" (ERMS), use a rep movsb instruction to implement memcpy. This speeds up copies significantly, especially large misaligned ones.
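A rough sketch of the idea, assuming GCC or Clang on x86-64 (other targets fall back to std::memcpy). The sketch_* names are hypothetical and this is not the code from the commit. ERMS is reported in CPUID leaf 7, subleaf 0, EBX bit 9.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
#include <cpuid.h>
#define SKETCH_HAVE_X86_ASM 1
#endif

// True if the CPU advertises Enhanced REP MOVSB/STOSB.
bool sketch_cpu_has_erms()
{
#ifdef SKETCH_HAVE_X86_ASM
    unsigned eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx)) {
        return false; // leaf 7 not available
    }
    return ((ebx >> 9) & 1u) != 0;
#else
    return false;
#endif
}

// Copy n bytes with a single "rep movsb" (RDI = dest, RSI = src, RCX = n).
void *sketch_memcpy_repmovsb(void *dest, const void *src, std::size_t n)
{
    assert(n == 0 || (dest != nullptr && src != nullptr));
#ifdef SKETCH_HAVE_X86_ASM
    void *ret = dest;
    asm volatile("rep movsb"
                 : "+D"(dest), "+S"(src), "+c"(n)
                 :
                 : "memory");
    return ret;
#else
    return std::memcpy(dest, src, n);
#endif
}

// Pick the rep movsb path only when ERMS is present.
void *sketch_memcpy(void *dest, const void *src, std::size_t n)
{
    static const bool erms = sketch_cpu_has_erms();
    return erms ? sketch_memcpy_repmovsb(dest, src, n)
                : std::memcpy(dest, src, n);
}
```

rep movsb is correct on any x86-64 CPU; the ERMS check only decides whether it is expected to be fast.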
-
- Jun 09, 2013
-
-
Nadav Har'El authored
The existing shutdown() code only worked with AF_INET sockets, and returned ENOTSOCK for AF_LOCAL sockets, because we implemented the latter sockets in completely different code (in af_local.cc). So in uipc_syscalls_wrap.c, the same place where we call the special af-local socketpair(), we also need to call the special af-local shutdown(). The way we do it is a bit ugly, but effective: shutdown() first calls shutdown_af_local(), and if that returns ENOTSOCK (so it's not an af_local socket), we continue trying the regular socket shutdown code. A better way would have been to add shutdown() to the fileops table - either the generic one (why not?), or invent a new mechanism whereby certain file types (in this case, "sockets" of all types) can have additional ops tables including in this case a shutdown() operation. Linux has something of this sort for implementing shutdown().
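The "try af_local first, fall back on ENOTSOCK" flow can be sketched as follows. Everything here is a hypothetical stand-in: the two back ends are fakes keyed on made-up descriptor numbers, and returning the error code directly from the helpers is an assumption about the calling convention.

```cpp
#include <cassert>
#include <cerrno>

// Fake back ends for illustration: descriptor 3 plays an af_local socket,
// descriptor 4 plays a regular (AF_INET) socket. Each returns 0 or an errno.
int sketch_shutdown_af_local(int fd, int how)
{
    (void)how;
    return fd == 3 ? 0 : ENOTSOCK;
}

int sketch_shutdown_inet(int fd, int how)
{
    (void)how;
    return fd == 4 ? 0 : ENOTSOCK;
}

// shutdown(): try the af_local implementation first and fall through to
// the regular socket code only when it reports "not one of mine".
int sketch_shutdown(int fd, int how)
{
    assert(fd >= 0);
    int error = sketch_shutdown_af_local(fd, how);
    if (error == ENOTSOCK) {
        error = sketch_shutdown_inet(fd, how);
    }
    if (error != 0) {
        errno = error;
        return -1;
    }
    return 0;
}
```

A per-file-type ops table, as the message suggests, would replace this chain of attempts with a single indirect call.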
-
- Jun 04, 2013
-
-
Nadav Har'El authored
This patch adds support for O_NONBLOCK on pipes and unix domain sockets. Java's EPollSelectorImpl uses a pipe to interrupt a sleeping poll, and it, quite understandably, sets the pipe to non-blocking (if you only write a single byte to a pipe, you don't expect any blocking anyway). So we can't croak if this option is used, and had better just implement it correctly.
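The wake-up pipe pattern described above, seen from the application side, looks roughly like this. The sketch uses only standard POSIX calls; the sketch_* function names are hypothetical.

```cpp
#include <cassert>
#include <cerrno>
#include <fcntl.h>
#include <unistd.h>

// Turn on O_NONBLOCK for a descriptor. Returns 0 on success, -1 on error.
int sketch_set_nonblocking(int fd)
{
    assert(fd >= 0);
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0) {
        return -1;
    }
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

// Exercise a non-blocking wake-up pipe; returns true if it behaves as
// POSIX describes.
bool sketch_wakeup_pipe_demo()
{
    int fds[2];
    if (pipe(fds) != 0) {
        return false;
    }
    bool ok = sketch_set_nonblocking(fds[0]) == 0 &&
              sketch_set_nonblocking(fds[1]) == 0;

    char byte = 0;
    // Reading an empty non-blocking pipe fails with EAGAIN instead of blocking.
    ok = ok && read(fds[0], &byte, 1) == -1 && errno == EAGAIN;

    // A one-byte write is the "wake up" and never needs to block here.
    const char wake = 'w';
    ok = ok && write(fds[1], &wake, 1) == 1;
    ok = ok && read(fds[0], &byte, 1) == 1 && byte == 'w';

    close(fds[0]);
    close(fds[1]);
    return ok;
}
```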
-
- Jun 02, 2013
-
-
Nadav Har'El authored
The source file af_local.cc implemented both pipes and bi-directional pipes (unix domain stream socketpair), using a common buffer implementation. As suggested by Guy, split this file into four files: pipe_buffer.cc and pipe_buffer.hh contain the common buffer implementation, class pipe_buffer. Since this buffer basically implements a single-direction pipe, I renamed it from "af_local_buffer" to pipe_buffer. af_local.cc now contains just the unix domain stream socketpair implementation, implemented using two pipe_buffer objects. af_pipe.cc contains the Posix pipe() implementation, implemented using one pipe_buffer object.
-
Nadav Har'El authored
The iovec iteration was broken, so both readv() and writev() on pipes and unix-domain stream sockets didn't work. Fix it.
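The commit message does not say how the iteration was broken, so the following is only an illustration of what a correct walk over an iovec array looks like: each segment is consumed in order, and the running byte count (not the segment index) decides where the next copy lands. sketch_gather is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <sys/uio.h>

// Copy bytes from the iovec segments, in order, into dst.
// Stops when dst is full; returns the number of bytes copied.
std::size_t sketch_gather(const struct iovec *iov, int iovcnt,
                          char *dst, std::size_t capacity)
{
    assert(iovcnt >= 0);
    std::size_t copied = 0;
    for (int i = 0; i < iovcnt && copied < capacity; i++) {
        const std::size_t n = std::min(iov[i].iov_len, capacity - copied);
        if (n == 0) {
            continue; // empty segment, nothing to copy
        }
        std::memcpy(dst + copied, iov[i].iov_base, n);
        copied += n; // next segment continues where this one ended
    }
    return copied;
}
```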
-
Nadav Har'El authored
This patch fixes two behaviors of pipes and unix-domain stream socketpairs which went against Posix and Linux standards:
1. A blocking write() on a pipe needs to return only when the full write is finished. It should not just write until the end of the pipe buffer and return, as we did in the previous code. This means that a long write() to a pipe can write the data in parts, waiting between them for a reader to read from the pipe.
2. As explained above, writes will be split into parts (and if there are multiple writers, get mixed with writes from other writers). But Posix also guarantees that short writes, up to 4096 bytes (PIPE_BUF==4096 on Linux), are *atomic*, and are not split up. In the previous code, if even 1 byte was available in the buffer, we wrote it. Now, if the write is short, we need to wait until the entire needed length is available.
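The two rules above boil down to one decision a blocking writer makes each time it looks at the buffer: how many bytes may be copied right now. The function below is a hypothetical sketch of that decision, not the patch's code; it assumes the pipe buffer holds at least PIPE_BUF bytes, otherwise an atomic write could wait forever.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// PIPE_BUF on Linux, as stated in the text above.
constexpr std::size_t sketch_pipe_buf = 4096;

// How many bytes a blocking writer may copy into the pipe right now.
//   total_len  - size of the whole write() request
//   remaining  - bytes of that request not yet written
//   free_space - bytes currently free in the pipe buffer
// A result of 0 means "wait for a reader, then ask again".
std::size_t sketch_next_chunk(std::size_t total_len, std::size_t remaining,
                              std::size_t free_space)
{
    assert(remaining <= total_len);
    if (total_len <= sketch_pipe_buf) {
        // Short write: all or nothing, so it is never interleaved.
        return free_space >= remaining ? remaining : 0;
    }
    // Long write: copy whatever fits, the caller loops until done.
    return std::min(remaining, free_space);
}
```

With one free byte in the buffer, a 100-byte write waits (result 0) while a 10000-byte write proceeds with a 1-byte chunk.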
-
Nadav Har'El authored
O_NONBLOCK is not yet supported in our implementation of unix-domain sockets or pipes, so until it is, abort() if it is used, instead of silently ignoring this mode and doing something very different from what the application expected.
-
- May 31, 2013
-
-
Guy Zana authored
Also, it's better to call poll_wake() with both POLLIN/POLLOUT and POLLRDNORM/POLLWRNORM, just to be on the safe side. I've seen a few references to these in the JDK.
-
- May 30, 2013
-
-
Nadav Har'El authored
This patch adds pipe(). The pipes are built using the same FIFO implementation, "af_local_buffer", as used by the existing unix-domain socketpair implementation: while the socketpair uses two of these buffers, a pipe uses one.

This implementation deviates from traditional POSIX pipe behavior in two ways that we should fix in follow-up patches:
1. SIGPIPE is not supported: A write to a pipe whose read end is closed will always return EPIPE, and not generate a SIGPIPE signal. Programs that rely on SIGPIPE will break, but SIGPIPE is completely out of fashion, and normally ignored.
2. Unix-style "atomic writes" are not obeyed. A write(), even if smaller than PIPE_BUF (=4096 on Linux, whose ABI we're emulating), may partially succeed if the pipe's buffer is nearly full. Only a write() of a single byte is guaranteed to be atomic. We hope that Java doesn't rely on multi-byte write() atomicity (single-byte writes are enough for waking poll, for example), and users of Java's "Pipe" class definitely can't (as Java is not Posix-only), so we hope this will not cause problems. Fixing this issue (which is easy) is left as a TODO in the code.

Additionally, this patch marks with a FIXME (but doesn't fix) a serious bug in the code's iovec handling, so writev() and readv() are expected not to work in this version of pipe(), nor on the existing socketpair.
-
Nadav Har'El authored
Previously, we re-implemented "unsupported" file operations (e.g., chmod for a pipe, on which fchmod makes no sense) several times: there was an implementation only for chmod in kern_descrip.c, used in sys_socket.c, and af_local.cc had its own. As we add more file descriptor types (e.g., epoll_create()) we'll have even more copies of these silly functions, so let's do it once in fs/unsupported.c, with the fs/unsupported.h header file. This also gives us a central place to document (and argue) whether an unimplemented ioctl() should return ENOTTY or EBADF (I think the former).
-
Nadav Har'El authored
If poll() was waiting on a file descriptor from socketpair_af_local, we would never wake it up; an example of this is the failure in a recently committed fix to tst-af-local.cc. The problem is that when one writes to one end of the socket, we need to call poll_wake() on the other end of the socket, so we need to remember which "struct file *" is attached to each end of the af_local_buffer objects. I chose what I thought was the most elegant solution: rather than being booleans, the "sender" and "receiver" of af_local_buffer are now "struct file *". I added new functions, attach_sender(f) and attach_receiver(f), which set the file* we'll need to notify for each end; these functions are analogous to the detach_sender and detach_receiver functions that we already had. After each interesting event (read, write, close, etc.) we notify the appropriate file*, using poll_wake(). attach_sender(f) and attach_receiver(f) are called by af_local_init(f), which used to be empty and now does something. Note how af_local_init(f) only does send->attach_sender(f) and receive->attach_receiver(f), but doesn't touch the two others (send->attach_receiver, receive->attach_sender): these other two are set when the second file descriptor, with the send and receive fifos in reversed roles, is initialized with its af_local_init. After this fix, the new af_local_test works correctly.
-
- May 29, 2013
-
-
Nadav Har'El authored
This patch implements readdir64, as an alias to readdir. We can do this because on 64-bit Linux, even the ordinary struct dirent uses 64-bit sizes, so the structures are identical. The reason we didn't miss this function earlier is that reasonable applications prefer to use readdir64_r, not readdir64. Because the Boost filesystem library thought we don't have the former (see next patch for fixing this), it used the latter.
-
- May 27, 2013
-
-
Christoph Hellwig authored
ZFS wants direct access to a global utsname structure. Provide one from core OSv code and rewrite uname to just copy it out. To ease this, move the uname implementation to a C file, as this allows using designated initializers and avoids the casting mess around memcpy.
-
- May 26, 2013
-
-
Guy Zana authored
strerror_r is needed by the JVM in order to print errors correctly.
-
- May 22, 2013
-
-
Avi Kivity authored
Rather than restricting our semaphore's implementation to be smaller than glibc's, use indirection to only store a pointer in the user's structure.
-
Avi Kivity authored
No code changes.
-
- May 20, 2013
-
-
Avi Kivity authored
We had a klugey pmutex class used to allow zero initialization of pthread_mutex_t. Now that the mutex class supports it natively we can drop it.
-
Avi Kivity authored
Use the generic one instead; the cleanup function allows destroying the pthread object.
-
Nadav Har'El authored
The previous implementation of backtrace() required frame pointers. This meant it could only be used in the "debug" build, but worse, it also got confused by libstdc++ (which was built without frame pointers), leading to incorrect stack traces, and more rarely, crashes. This changes backtrace() to use libunwind instead, which works even without frame pointers. To satisfy the link dependencies, libgcc_eh.a needs to be linked *after* libunwind.a. Because we also need it linked *before* for other reasons, we end up with libgcc_eh.a twice on the linker's command line. The horror...
-
Nadav Har'El authored
libunwind, which the next patches will use to implement a more reliable backtrace(), needs the msync() function. It doesn't need it to actually sync anything - just to recognize valid frame addresses (stacks are always mmap()ed). Note this implementation does the checking, but is missing the "sync" part of msync ;-) It doesn't matter because: 1. libunwind doesn't need (or want) this syncing, and neither does anything else in the Java stack (until now, msync() was never used). 2. We don't (yet?) have write-back of mmap'ed memory anyway, so there's no sense in doing any writing in msync either. We'll need to work on a full read-write implementation of file-backed mmap() later.
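To illustrate why an msync() that only checks addresses is enough for this purpose, here is how a caller such as an unwinder might use msync() purely as a validity probe. This is a sketch under stated assumptions: sketch_is_mapped is a hypothetical helper, not libunwind's code, and it relies on msync() failing (with ENOMEM on Linux) when the page is not mapped.

```cpp
#include <cassert>
#include <cstdint>
#include <sys/mman.h>
#include <unistd.h>

// Returns true if the page containing `addr` is currently mapped.
// Nothing is expected to be written back; only the error check matters.
bool sketch_is_mapped(const void *addr)
{
    const long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);
    const std::uintptr_t mask = static_cast<std::uintptr_t>(page) - 1;
    // msync() requires a page-aligned start address.
    void *start = reinterpret_cast<void *>(
        reinterpret_cast<std::uintptr_t>(addr) & ~mask);
    return msync(start, 1, MS_ASYNC) == 0;
}
```

A local variable's address probes as mapped, while a page that was just munmap()ed probes as unmapped.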
-
- May 18, 2013
-
-
Christoph Hellwig authored
-
Avi Kivity authored
-
Avi Kivity authored
Probably unneeded with the prototype fix, but can't hurt.
-
Avi Kivity authored
musl provides those, so use their definitions.
-
Avi Kivity authored
-
Avi Kivity authored
Not provided by musl, so mark as extern "C".
-
Avi Kivity authored
-
Avi Kivity authored
add extern "C" where needed, fix up prototypes.
-
Avi Kivity authored
-
- May 16, 2013
-
-
Nadav Har'El authored
This function happens to be used by Java's "-Xcheck:jni", and is trivial to implement, so why not...
-
- May 10, 2013
-
-
Christoph Hellwig authored
-
- May 07, 2013
-
-
Avi Kivity authored
Record the guard size in the thread attributes, and mprotect() that bit so the user cannot overflow the stack.
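A user-space sketch of the guard idea: map the stack plus a guard region, then mprotect() the guard to PROT_NONE so that running off the end of the stack faults instead of silently corrupting memory. The sketch_* names are hypothetical, and placing the guard at the low end assumes a downward-growing stack (as on x86-64).

```cpp
#include <cassert>
#include <cstddef>
#include <sys/mman.h>
#include <unistd.h>

struct sketch_stack {
    void *mapping = nullptr;      // start of the whole mapping (guard first)
    std::size_t mapping_size = 0; // guard + usable bytes
    void *usable = nullptr;       // lowest usable stack address
    std::size_t usable_size = 0;
};

static std::size_t sketch_round_up(std::size_t n, std::size_t page)
{
    return (n + page - 1) / page * page;
}

// Allocate a stack with an inaccessible guard region below it.
bool sketch_alloc_stack(std::size_t stack_size, std::size_t guard_size,
                        sketch_stack *out)
{
    assert(stack_size > 0 && out != nullptr);
    const std::size_t page = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
    guard_size = sketch_round_up(guard_size, page);
    stack_size = sketch_round_up(stack_size, page);
    const std::size_t total = guard_size + stack_size;

    void *base = mmap(nullptr, total, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED) {
        return false;
    }
    // Any access to the guard now faults instead of overflowing quietly.
    if (guard_size != 0 && mprotect(base, guard_size, PROT_NONE) != 0) {
        munmap(base, total);
        return false;
    }
    out->mapping = base;
    out->mapping_size = total;
    out->usable = static_cast<char *>(base) + guard_size;
    out->usable_size = stack_size;
    return true;
}

void sketch_free_stack(sketch_stack *s)
{
    munmap(s->mapping, s->mapping_size);
}
```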
-
Avi Kivity authored
Record stack size in pthread_attr_setstacksize(), and use it when creating the thread.
-
Avi Kivity authored
-