Commits · c6399e552ae40578dbeffe0882fecf0292dcf85d · Verlässliche Systemsoftware / projects / osv

Jun 30, 2013

Stub getpwnam(), setuid() and setgid() · c6399e55

Nadav Har'El authored 11 years ago

Implement getpwnam(), setuid() and setgid() in the simplest way possible
considering that we don't support any userid except 0:

getpwnam() returns user 0 for any username given to it.
setuid() and setgid() do nothing for uid or gid 0, otherwise fail.
Where would the caller get this !=0 id anyway?

Memcached needs these calls, because it wants to be clever and
warn the user against running it as root....
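
A minimal sketch of the stubs described above. The stub_* names are
hypothetical (chosen so they don't clash with the libc symbols), and EPERM
as the failure code is an assumption, since the message only says "fails":

```cpp
#include <cassert>
#include <cerrno>
#include <pwd.h>
#include <sys/types.h>

// Hypothetical stand-in for getpwnam(): every name maps to user 0.
static struct passwd *stub_getpwnam(const char *name)
{
    static char root_name[] = "root";
    static char root_dir[] = "/";
    static char no_shell[] = "";
    static struct passwd root;   // zero-initialized, so unset fields are null/0
    (void)name;                  // the requested name is deliberately ignored
    root.pw_name = root_name;
    root.pw_uid = 0;
    root.pw_gid = 0;
    root.pw_dir = root_dir;
    root.pw_shell = no_shell;
    return &root;
}

// Hypothetical stand-in for setuid(): only uid 0 is accepted.
static int stub_setuid(uid_t uid)
{
    if (uid == 0) {
        return 0;
    }
    errno = EPERM;               // assumed error code
    return -1;
}

// Hypothetical stand-in for setgid(): only gid 0 is accepted.
static int stub_setgid(gid_t gid)
{
    if (gid == 0) {
        return 0;
    }
    errno = EPERM;               // assumed error code
    return -1;
}
```

With stubs like these, a caller that looks up any user and then drops
privileges to the returned uid ends up calling stub_setuid(0), which succeeds.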

c6399e55

libc: implement signal() · 1ca1bd43

Nadav Har'El authored 11 years ago

Implement the signal() function. This is hardly a useful function in OSV,
first because our signal support is pretty broken, and second because
sigaction() is a much more portable API that should always be preferred.

Nevertheless, memcached uses signal() (to catch SIGINT, which it will
never get in OSV...), so let's implement it for the sake of completeness.
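
For illustration, signal() can be layered on top of sigaction() roughly as
follows. This is a sketch, not necessarily what the patch does; the name
signal_via_sigaction() and the choice of SA_RESTART (BSD-style semantics)
are assumptions:

```cpp
#include <cassert>
#include <signal.h>

using handler_t = void (*)(int);

// Hypothetical wrapper: installs `handler` for `sig` and returns the
// previous handler, or SIG_ERR on failure, like signal() does.
static handler_t signal_via_sigaction(int sig, handler_t handler)
{
    struct sigaction act = {};
    struct sigaction old = {};
    act.sa_handler = handler;
    sigemptyset(&act.sa_mask);
    act.sa_flags = SA_RESTART;   // assumed: restart interrupted system calls
    if (sigaction(sig, &act, &old) < 0) {
        return SIG_ERR;
    }
    return old.sa_handler;
}

// Small handler used only to demonstrate the wrapper.
static volatile sig_atomic_t last_signal = 0;

static void note_signal(int sig)
{
    last_signal = sig;
}
```

Installing note_signal for SIGUSR1 and then calling raise(SIGUSR1) should
leave SIGUSR1 in last_signal.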

1ca1bd43

Jun 18, 2013
- build: rename build.mak to build.mk · aac18bc2
  Avi Kivity authored 11 years ago
  
  Eclipse recognizes .mk as a makefile, make it easier for new users to use eclipse.
  aac18bc2
Jun 13, 2013

Implement usleep() · 0be1f9e0

Nadav Har'El authored 11 years ago

usleep() was scrubbed out of POSIX in 2008, and not used in Java, but
it does exist in glibc and is damn easy to use compared to its newer
relative, nanosleep, so I want to use it in a test.
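
To show the difference in convenience, here is a sketch of usleep()
expressed with nanosleep(). The name usleep_via_nanosleep() is hypothetical,
and whether the patch implements usleep() this way is an assumption:

```cpp
#include <cassert>
#include <time.h>

// Hypothetical helper: sleep for `usec` microseconds using nanosleep().
static int usleep_via_nanosleep(unsigned int usec)
{
    struct timespec ts;
    ts.tv_sec = usec / 1000000;                             // whole seconds
    ts.tv_nsec = static_cast<long>(usec % 1000000) * 1000;  // remainder in ns
    return nanosleep(&ts, nullptr);
}
```

The caller writes usleep_via_nanosleep(1500) instead of filling in a
struct timespec by hand.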

0be1f9e0

shutdown_af_local: add missing locks · 67923f37

Nadav Har'El authored 11 years ago

As Avi pointed out, shutdown_af_local() did read-modify-write to
f->f_flags without locking. Add the missing locks.

67923f37

Jun 12, 2013

Optionally enable (disabled by default) lock-free mutex · a2cb99d5

Nadav Har'El authored 11 years ago

This patch optionally enables, at compile-time, OSV to use the lock-free
mutex instead of the spin-lock-based mutex. To use the lock-free mutex,
change the line "#undef LOCKFREE_MUTEX" in include/osv/mutex.h to
"#define LOCKFREE_MUTEX".

LOCKFREE_MUTEX is currently disabled by default, awaiting a few more
tests, but at this point I'm happy to say that beyond one known
unrelated bug (see details below), it seems the lock-free mutex is
fairly stable, and survives all tests and benchmarks I threw at it.

The remaining known bug involves a thread destruction race between
complete() and join(): complete wake()s the joiner thread, which in
rare cases can really quickly delete the thread's stack, before wake()
returns, causing a crash on return from wake(). This bug is really
unrelated to the lock-free mutex, but for some unknown reason I can
only reproduce it with the lock-free mutex on the SPECjvm2008 "sunflow"
benchmark.

To make lockfree::mutex our default mutex, this patch does the following
when LOCKFREE_MUTEX is defined:

1. In core/mutex.cc, #ifndef out the old mutex code, leaving the
   spinlock code in case someone wants to use it directly.

2. In include/osv/mutex.h, do different things in C++ and C (remember that
   lockfree::mutex is a C++ class, and cannot be used directly from C):

   * In C++, simply make mutex and mutex_t aliases for lockfree::mutex.

   * In C, make struct mutex and mutex_t an opaque 40-byte structure (in
     C++ compilation, we verify that this 40 is indeed the C++ class's
     length), and define the operations on it.

3. In libc/pthread.cc, if LOCKFREE_MUTEX, unfortunately the new mutex
   will not fit into pthread_mutex_t, and neither will condvar fit now
   into pthread_cond_t. So use a lazily allocated mutex or condvar, using
   the lazy_indirect<> template.
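
The idea behind step 3 can be sketched like this. lazy_indirect_sketch<> is
a hypothetical simplification, not the actual lazy_indirect<> code: the
user-visible structure only holds a pointer, which starts out null (so a
zero-filled static initializer still works) and is allocated on first use:

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <pthread.h>

// Hypothetical, simplified take on a lazily allocated indirect object.
template <typename T>
class lazy_indirect_sketch {
public:
    constexpr lazy_indirect_sketch() : _ptr(nullptr) {}
    ~lazy_indirect_sketch() { delete _ptr.load(); }

    // Returns the real object, allocating it on first use.
    T *get()
    {
        T *p = _ptr.load(std::memory_order_acquire);
        if (!p) {
            T *fresh = new T;
            if (_ptr.compare_exchange_strong(p, fresh,
                                             std::memory_order_acq_rel,
                                             std::memory_order_acquire)) {
                p = fresh;       // we won the race and published our object
            } else {
                delete fresh;    // another thread won; p now holds its object
            }
        }
        return p;
    }

private:
    std::atomic<T *> _ptr;
};

// The wrapper is pointer-sized, so it fits where the full object would not.
static_assert(sizeof(lazy_indirect_sketch<std::mutex>) <= sizeof(pthread_mutex_t),
              "indirect wrapper must fit inside pthread_mutex_t");

// Demonstration instance; std::mutex stands in for the lock-free mutex.
static lazy_indirect_sketch<std::mutex> demo_slot;
```

The cost of this design is one extra pointer dereference per operation, in
exchange for no longer being constrained by the size of the glibc types.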

a2cb99d5

Jun 10, 2013

libc: optimized memcpy() · 06dd5386

Avi Kivity authored 11 years ago

If the cpu supports "Enhanced REP MOVS / STOS" (ERMS), use a rep movsb
instruction to implement memcpy.  This speeds up copies significantly,
especially large misaligned ones.
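
A sketch of the core of such a memcpy, using GCC/Clang inline assembly.
The name memcpy_rep_movsb() is hypothetical, and the ERMS feature check
described above is left out here: rep movsb is correct on any x86 CPU, ERMS
only determines whether it is fast. On other architectures the sketch falls
back to std::memcpy():

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: copy n bytes with a single "rep movsb".
static void *memcpy_rep_movsb(void *dest, const void *src, std::size_t n)
{
#if defined(__x86_64__) || defined(__i386__)
    void *d = dest;
    // rep movsb copies (r|e)cx bytes from [(r|e)si] to [(r|e)di] and
    // advances all three registers, hence the read-write constraints.
    asm volatile("rep movsb"
                 : "+D"(d), "+S"(src), "+c"(n)
                 :
                 : "memory");
    return dest;
#else
    return std::memcpy(dest, src, n);
#endif
}
```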

06dd5386

Jun 09, 2013

Implement shutdown() on unix domain sockets · 30f6e9dd

Nadav Har'El authored 11 years ago

The existing shutdown() code only worked with AF_INET sockets, and returned
ENOTSOCK for AF_LOCAL sockets, because we implemented the latter sockets in
completely different code (in af_local.cc).

So in uipc_syscalls_wrap.c, the same place we call the special af-local
socketpair(), we also need to call the special af-local shutdown().

The way we do it is a bit ugly, but effective: shutdown() first calls
shutdown_af_local(), and if that returns ENOTSOCK (so it's not an af_local
socket), we continue trying the regular socket shutdown code.
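
The fall-through described above can be sketched with a toy model. All
*_sketch names and the descriptor ranges are hypothetical; only the
"try af_local first, fall back on ENOTSOCK" control flow comes from the
description:

```cpp
#include <cassert>
#include <cerrno>

// Toy model (assumption): fds 0..9 are af_local sockets, fds 10..19 are
// regular sockets, anything else is not a socket at all.
static int shutdown_af_local_sketch(int fd, int how)
{
    (void)how;
    return (fd >= 0 && fd < 10) ? 0 : ENOTSOCK;
}

static int shutdown_regular_sketch(int fd, int how)
{
    (void)how;
    return (fd >= 10 && fd < 20) ? 0 : ENOTSOCK;
}

// Returns 0 on success or an errno value on failure.
static int shutdown_sketch(int fd, int how)
{
    int error = shutdown_af_local_sketch(fd, how);
    if (error != ENOTSOCK) {
        return error;            // it was an af_local socket (success or not)
    }
    return shutdown_regular_sketch(fd, how);
}
```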

A better way would have been to add shutdown() to the fileops table -
either the generic one (why not?), or invent a new mechanism whereby
certain file types (in this case, "sockets" of all types) can have additional
ops tables including in this case a shutdown() operation. Linux has
something of this sort for implementing shutdown().

30f6e9dd

Jun 04, 2013

Nonblocking pipes · 1fa558ef

Nadav Har'El authored 11 years ago

This patch adds support for O_NONBLOCK on pipes and unix domain sockets.

Java's EPollSelectorImpl uses a pipe to interrupt a sleeping poll, and it,
quite understandably, sets them to non-blocking (if you only write a
single byte to a pipe, you don't expect any blocking anyway).
So we can't croak if this option is used, and better just implement it
correctly.
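
For reference, this is the behavior an application expects from a
non-blocking pipe, shown with the standard POSIX calls. The helper name
read_empty_nonblocking_pipe() is hypothetical:

```cpp
#include <cassert>
#include <cerrno>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper: reads from an empty non-blocking pipe and returns
// the errno value of that read (0 if the read unexpectedly succeeded,
// -1 if the pipe could not be set up).
static int read_empty_nonblocking_pipe()
{
    int fds[2];
    if (pipe(fds) != 0) {
        return -1;
    }
    int flags = fcntl(fds[0], F_GETFL);
    if (flags < 0 || fcntl(fds[0], F_SETFL, flags | O_NONBLOCK) != 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    char c;
    ssize_t n = read(fds[0], &c, 1);   // nothing written yet: must not block
    int err = (n < 0) ? errno : 0;
    close(fds[0]);
    close(fds[1]);
    return err;
}
```

On Linux the read fails immediately with EAGAIN instead of blocking.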

1fa558ef

Jun 02, 2013

Split af_local.cc into four files · 729bdbd5

Nadav Har'El authored 11 years ago

The source file af_local.cc implemented both pipes and bi-directional
pipes (unix domain stream socketpair), using a common buffer implementation.

As suggested by Guy, split this file into four files:

pipe_buffer.cc and pipe_buffer.hh contain the common buffer implementation,
class pipe_buffer. Since this buffer basically implements a single-direction
pipe, I renamed it from "af_local_buffer" to pipe_buffer.

af_local.cc now contains just the unix domain stream socketpair
implementation, implemented using two pipe_buffer objects.

af_pipe.cc contains the Posix pipe() implementation, implemented using
one pipe_buffer object.

729bdbd5

Fix readv() and writev() support in pipe and unix-domain socket. · 09e61023

Nadav Har'El authored 11 years ago

The iovec iteration was broken, so both readv() and writev() on pipes
and unix-domain stream sockets didn't work. Fix it.

09e61023

Atomic writes, and long writes, to pipes. · 2a6a0391

Nadav Har'El authored 11 years ago

This patch fixes two behaviors of pipes and unix-domain stream socketpair,
which went against Posix and Linux standards:

1. A blocking write() on a pipe needs to return only when the full write
is finished. It should not just write until the end of the pipe buffer
and return - as we did in the previous code.

This means that a long write() to a pipe can write the data in parts,
waiting between them for a reader to read from the pipe.

2. As explained above, writes will be split into parts (and if there are
multiple writers, get mixed with writes from other writers). But Posix
also guarantees that short writes - up to 4096 bytes (PIPE_BUF==4096
on Linux) - are *atomic*, and are not split up.
In the previous code, if even 1 byte was available on the buffer,
we wrote it. Now, if the write is short, we need to wait until the
entire needed length is available.
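
The two rules can be condensed into one decision: how many bytes may a
blocking writer copy into the buffer right now? The sketch below uses
hypothetical names (writable_now(), pipe_buf_sketch) and assumes the
buffer's capacity is at least PIPE_BUF, so a short write can always
eventually fit:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

constexpr std::size_t pipe_buf_sketch = 4096;   // PIPE_BUF on Linux

// Hypothetical helper: number of bytes a blocking writer may copy now.
// A result of 0 means "wait for the reader to free up space".
static std::size_t writable_now(std::size_t len, std::size_t free_space)
{
    if (len <= pipe_buf_sketch) {
        // Short write: all or nothing, so it is never split.
        return free_space >= len ? len : 0;
    }
    // Long write: take whatever fits; the caller loops until done.
    return std::min(len, free_space);
}
```

A blocking write() would call this in a loop, sleeping whenever it returns
0, and return to the caller only after all len bytes have been copied.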

2a6a0391

Abort if unsupported O_NONBLOCK used on unix-domain socket or pipe. · b0c593aa

Nadav Har'El authored 11 years ago

O_NONBLOCK is not yet supported in our implementation of unix-domain
sockets or pipes, so until it is, abort() if it is used, instead of
silently ignoring this mode and doing something very different from
what the application expected.

b0c593aa

May 31, 2013

pipe: bug fix used logical and (&&) instead of bitwise and (&) · e2b4501a

Guy Zana authored 11 years ago

also, it's better to call poll_wake() with both POLLIN/OUT and
POLLRDNORM/POLLWRNORM, just to be on the safe side. seen a few
references in the jdk.
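
To illustrate the class of bug (the exact expression fixed by the patch is
not shown in this message, so the functions below are hypothetical):

```cpp
#include <cassert>
#include <poll.h>

// Buggy: "&&" only asks whether both operands are non-zero, so this is
// true for any non-zero event mask, even one without POLLIN.
static bool wants_read_buggy(short events)
{
    return events && POLLIN;
}

// Fixed: "&" actually tests the POLLIN bit.
static bool wants_read_fixed(short events)
{
    return (events & POLLIN) != 0;
}
```

With events == POLLOUT the buggy check still reports read interest, while
the fixed one does not.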

e2b4501a

May 30, 2013

Add pipe() · 8ef91f0d

Nadav Har'El authored 11 years ago

This patch adds pipe(). The pipes are built using the same FIFO implementation,
"af_local_buffer", as used by the existing unix-domain socketpair
implementation - while the socket-pair used two of these buffers, a pipe
uses one.

This implementation deviates from traditional POSIX pipe behavior in two
ways that we should fix in followup-patches:

1. SIGPIPE is not supported: A write to a pipe whose read end is closed
will always return EPIPE, and not generate a SIGPIPE signal.
Programs that rely on SIGPIPE will break, but SIGPIPE is completely out
of fashion, and normally ignored.

2. Unix-style "atomic writes" are not obeyed. A write(), even if smaller
than PIPE_BUF (=4096 on Linux, whose ABI we're emulating), may partially
succeed if the pipe's buffer is nearly full. Only a write() of a single
byte is guaranteed to be atomic.

We hope that Java doesn't rely on multi-byte write() atomicity
(single-byte writes are enough for waking poll, for example), and users
of Java's "Pipe" class definitely can't (as Java is not Posix-only),
so we hope this will not cause problems. Fixing this issue (which is easy)
is left as a TODO in the code.

Additionally, this patch marks with a FIXME (but doesn't fix) a serious
bug in the code's iovec handling, so writev() and readv() are expected
not to work in this version of pipe() - and also on the existing socketpair.

8ef91f0d

Move unsupported fileops to fs/unsupported.c · 5062ff4f

Nadav Har'El authored 11 years ago

Previously, we re-implemented "unsupported" file operations (e.g., chmod
for a pipe on which fchmod makes no sense) several times - there was
an implementation only for chmod in kern_descrip.c, used in sys_socket.c,
and af_local.cc had its own. As we add more file descriptor types (e.g.,
create_epoll()) we'll have even more copies of these silly functions, so
let's do it once in fs/unsupported.c - with the fs/unsupported.h header
file.

This also gives us a central place to document (and argue) whether an
unimplemented ioctl() should return ENOTTY or EBADF (I think the former).

5062ff4f

Fix waiting poll on unix-domain socketpair · c58c7aac

Nadav Har'El authored 11 years ago

If poll() was waiting on a file descriptor from socketpair_af_local, we
would never wake it up, and an example of this is the failure in a
recently committed fix to tst-af-local.cc.

The problem is that when one writes to one end of the socket, we need to
call poll_wake() on the other end of the socket, so we need to remember
which "struct file *" is attached to each end of the af_local_buffer objects.

I went with what I think is the most elegant solution:

Rather than having "sender" and "receiver" of af_local_buffer booleans,
they are now "struct file *". I added new functions, attach_sender(f) and
attach_receiver(f), which set the file* we'll need to notify for each
end; These functions are analogous to functions detach_sender, detach_receiver
that we already had.

After each interesting event - read, write, close, etc - we notify the
appropriate file*, using poll_wake.

attach_sender(f) and attach_receiver(f) are called by af_local_init(f) - which
used to be empty and now does something. Note how af_local_init(f) only
does send->attach_sender(f) and receive->attach_receiver(f), but doesn't
touch the two others (send->attach_receiver, receive->attach_sender) -
these other two are set when the second file descriptor, with the send
and receive fifos in reversed roles, is initialized with its af_local_init.

After this fix, the new af_local_test works correctly.
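
The attach/notify scheme can be sketched as follows. Everything with a
_sketch suffix is a hypothetical simplification (the real code operates on
"struct file" and af_local_buffer); only the wiring follows the description
above:

```cpp
#include <cassert>
#include <poll.h>

// Stand-in for "struct file": records the events it was woken with.
struct file_sketch {
    int woken = 0;
};

// Stand-in for poll_wake(); a null file means that end isn't attached yet.
static void poll_wake_sketch(file_sketch *f, int events)
{
    if (f) {
        f->woken |= events;
    }
}

// One direction of the socketpair.
class buffer_sketch {
public:
    void attach_sender(file_sketch *f) { _sender = f; }
    void attach_receiver(file_sketch *f) { _receiver = f; }

    void write_byte()
    {
        ++_queued;
        poll_wake_sketch(_receiver, POLLIN);   // data became available
    }

    void read_byte()
    {
        assert(_queued > 0);
        --_queued;
        poll_wake_sketch(_sender, POLLOUT);    // space became available
    }

private:
    file_sketch *_sender = nullptr;
    file_sketch *_receiver = nullptr;
    int _queued = 0;
};

// Each file attaches itself only to "its" side of the two buffers; the
// other two attachments happen when the peer file is initialized.
static void af_local_init_sketch(file_sketch *f, buffer_sketch *send,
                                 buffer_sketch *receive)
{
    send->attach_sender(f);
    receive->attach_receiver(f);
}
```

After initializing both files with the buffers in swapped roles, a write
through one file wakes pollers on the other.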

c58c7aac

May 29, 2013

Implement missing readdir64() as alias to readdir() · 61e295f2

Nadav Har'El authored 11 years ago

This patch implements readdir64, as an alias to readdir. We can do this,
because on 64-bit Linux, even the ordinary struct dirent uses 64-bit
sizes, so the structures are identical.

The reason we didn't miss this function earlier is that reasonable
applications prefer to use readdir64_r, not readdir64. Because Boost
filesystem library thought we don't have the former (see next patch
for fixing this), it used the latter.

61e295f2

May 27, 2013

provide a utsname structure · 43c3f6dd

Christoph Hellwig authored 11 years ago

ZFS wants direct access to a global utsname structure. Provide one from
core OSv code and rewrite uname to just copy it out. To ease this move
the uname implementation to a C file as this allows using designated
initializers and avoids the casting mess around memcpy.

43c3f6dd

May 26, 2013
- libc: fix strerror_r, should not appear as __xpg_strerror_r() · c481b94d
  Guy Zana authored 11 years ago
  
  strerror_r is needed by the JVM in order to print errors correctly.
  c481b94d
May 22, 2013
- libc: use indirection for accessing the semaphore implementation · 39daaed5
  Avi Kivity authored 11 years ago
  
  Rather than restricting our semaphore's implementation to be smaller than glibc's, use indirection to only store a pointer in the user's structure.
  39daaed5
- semaphore: extract generic semaphore from pthread semaphore implementation · 747ff478
  Avi Kivity authored 11 years ago
  
  No code changes.
  747ff478
May 20, 2013

pthread: drop 'pmutex' · 83745222

Avi Kivity authored 11 years ago

We had a klugey pmutex class used to allow zero initialization of
pthread_mutex_t.  Now that the mutex class supports it natively we
can drop it.

83745222

pthread: drop pthread's zombie reaper · e25fc7e7

Avi Kivity authored 11 years ago

Use the generic one instead; the cleanup function allows destroying
the pthread object.

e25fc7e7

Replace backtrace() implementation with one using libunwind · 53c7ade5

Nadav Har'El authored 11 years ago

The previous implementation of backtrace() required frame pointers.
This meant it could only be used in the "debug" build, but worse,
it also got confused by libstdc++ (which was built without frame pointers),
leading to incorrect stack traces, and more rarely, crashes.

This changes backtrace() to use libunwind instead, which works even
without frame pointers. To satisfy the link dependencies, libgcc_eh.a
needs to be linked *after* libunwind.a. Because we also need it linked
*before* for other reasons, we end up with libgcc_eh.a twice on the
linker's command line. The horror...

53c7ade5

Add partial implementation of msync() for libunwind · de374193

Nadav Har'El authored 11 years ago

libunwind, which the next patches will use to implement a more reliable
backtrace(), needs the msync() function. It doesn't need it to actually
sync anything - just to recognize valid frame addresses (stacks are
always mmap()ed).

Note this implementation does the checking, but is missing the "sync" part
of msync ;-) It doesn't matter because:

1. libunwind doesn't need (or want) this syncing, and neither does anything
else in the Java stack (until now, msync() was never used).

2. We don't (yet?) have write-back of mmap'ed memory anyway, so there's
no sense in doing any writing in msync either. We'll need to work on
a full read-write implementation of file-backed mmap() later.
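
The "checking" use of msync() looks roughly like this from the caller's
side. The helper name address_is_mapped() is hypothetical, and the sketch
assumes Linux-style behavior, where msync() on an unmapped range fails
(with ENOMEM):

```cpp
#include <cassert>
#include <cstdint>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical helper: probe whether the page containing addr is mapped,
// without reading or writing it.
static bool address_is_mapped(const void *addr)
{
    const std::uintptr_t page =
        static_cast<std::uintptr_t>(sysconf(_SC_PAGESIZE));
    // msync() requires a page-aligned start address.
    const std::uintptr_t start =
        reinterpret_cast<std::uintptr_t>(addr) & ~(page - 1);
    return msync(reinterpret_cast<void *>(start), 1, MS_ASYNC) == 0;
}
```

An unwinder can use a probe like this before dereferencing a candidate
frame address, so a bogus pointer yields a truncated backtrace rather than
a crash.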

de374193

May 18, 2013
- realpath.c needs <limits.h> now · dbae073f
  Christoph Hellwig authored 11 years ago
  
  dbae073f
- libc: remove duplicate TZNAME_MAX definition · ceab9202
  Avi Kivity authored 11 years ago
  
  ceab9202
- libc: extern "C" gettimeofday() · 2a7486dd
  Avi Kivity authored 11 years ago
  
  Probably unneeded with the prototype fix, but can't hurt.
  2a7486dd
- libc: remove duplicate definitions · 1ee1a237
  Avi Kivity authored 11 years ago
  
  musl provides those, so use their definitions.
  1ee1a237
- libc: add missing pthread.h include · 39c2522c
  Avi Kivity authored 11 years ago
  
  39c2522c
- libc: fix up loadavg() prototype · e7aff417
  Avi Kivity authored 11 years ago
  
  Not provided by musl, so mark as extern "C".
  e7aff417
- libc: make locale implementation match glibc's · 6db6cc59
  Avi Kivity authored 11 years ago
  
  6db6cc59
- dlfcn: fix up function definitions · 8844ac96
  Avi Kivity authored 11 years ago
  
  add extern "C" where needed, fix up prototypes.
  8844ac96
- libc: define __DIR_s · 61f9db1e
  Avi Kivity authored 11 years ago
  
  61f9db1e
May 16, 2013

Implement sigismember() · 191eaf08

Nadav Har'El authored 11 years ago

This function happens to be used by Java's "-Xcheck:jni", and is
trivial to implement, so why not...

191eaf08

May 10, 2013
- remove .rej and .orig files from the last musl upgrade · bec1efd1
  Christoph Hellwig authored 11 years ago
  
  bec1efd1
May 07, 2013
- pthread: honor pthread_attr_setguardsize() · b74fc0a0
  Avi Kivity authored 11 years ago
  
  Record the guard size in the thread attributes, and mprotect() that bit so the user cannot overflow the stack.
  b74fc0a0
- pthread: honor pthread stack size · 38f14171
  Avi Kivity authored 11 years ago
  
  Record stack size in pthread_attr_setstacksize(), and use it when creating the thread.
  38f14171
- pthread: use scheduler stack deleter facility for destroying the stack · 3bd08741
  Avi Kivity authored 11 years ago
  
  3bd08741
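
As an illustration of the guard-size commit above (b74fc0a0), a stack with
a guard region can be set up with mmap() and mprotect() roughly like this.
The names are hypothetical, the sizes are assumed to be multiples of the
page size, and the guard is placed at the low end on the assumption that
the stack grows downward:

```cpp
#include <cassert>
#include <cstddef>
#include <sys/mman.h>

// Hypothetical description of an allocated stack.
struct stack_sketch {
    void *base = nullptr;     // start of the mapping, i.e. the guard region
    std::size_t guard = 0;    // bytes made inaccessible
    std::size_t size = 0;     // usable bytes above the guard
    char *usable() const { return static_cast<char *>(base) + guard; }
};

// Maps stack_size + guard_size bytes and revokes access to the lowest
// guard_size bytes. On failure the returned base is null.
static stack_sketch allocate_stack_sketch(std::size_t stack_size,
                                          std::size_t guard_size)
{
    stack_sketch s;
    const std::size_t total = stack_size + guard_size;
    void *p = mmap(nullptr, total, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) {
        return s;
    }
    if (guard_size != 0 && mprotect(p, guard_size, PROT_NONE) != 0) {
        munmap(p, total);
        return s;
    }
    s.base = p;
    s.guard = guard_size;
    s.size = stack_size;
    return s;
}
```

A thread that overflows such a stack faults in the guard region instead of
silently overwriting whatever memory happens to lie below it.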