  1. Jul 28, 2013
  2. Jul 27, 2013
    • bsd: add fls implementation · 7445c1e0
      Glauber Costa authored
      Because this is arch-specific, I am adding it to a newly created file in
      arch/x64. I am making it available to BSD through netport, for lack of a
      better place.
      7445c1e0
  3. Jul 18, 2013
  4. Jul 11, 2013
    • xen: massage xen.cc · 81f42426
      Glauber Costa authored
      Since we will now have the xen interface files for BSD anyway, let's use
      their more readable definitions instead of hardcoded numbers and
      duplicated strings.
      81f42426
    • delete xen buggy hypercall · 5da2d022
      Glauber Costa authored
      This was badly adapted from Avi's 5-argument hypercall. It is not in use
      for now, so let's just delete it.
      5da2d022
  5. Jul 08, 2013
    • percpu: speed up percpu base address calculations · 571d39dd
      Avi Kivity authored
      Currently, we look up the current thread, then the current cpu, then the cpu
      id, then the percpu base (through a vector).  This is slow.
      
      Speed this up by storing the percpu base in a thread-local variable; this
      variable is updated when the thread is started or migrated.
      571d39dd
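A minimal sketch of the idea in this commit, with illustrative names (`cpu`, `on_start_or_migrate`, `percpu_ptr` are assumptions, not OSv's actual API): the per-cpu area base is cached in a thread-local pointer that the scheduler refreshes on thread start or migration, so a per-cpu access becomes a single addition instead of a thread -> cpu -> id -> vector chain of lookups.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch: each cpu owns a per-cpu data area.
struct cpu {
    std::vector<char> percpu_area;
    explicit cpu(std::size_t size) : percpu_area(size) {}
};

// Cached base of the current cpu's per-cpu area; refreshed by the
// scheduler when a thread starts or migrates (assumed hook below).
thread_local char* percpu_base = nullptr;

void on_start_or_migrate(cpu& c) {
    percpu_base = c.percpu_area.data();
}

// A per-cpu variable access is now one addition from the cached base.
template <typename T>
T* percpu_ptr(std::size_t offset) {
    return reinterpret_cast<T*>(percpu_base + offset);
}
```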
    • mempool: convert the memory allocator to be per-cpu · 3d4653e7
      Guy Zana authored
      The new code partitions the free list of pages of each pool to be
      per-cpu; allocations and deallocations are done locklessly.
      
      It uses worker items to handle the case where free() for a buffer is
      called from a cpu different from the one that allocated it: we use N^2
      rings for communicating between the threads and the worker items, and
      the worker item then performs the free() on the same cpu the buffer
      was allocated on.
      3d4653e7
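The N^2-rings scheme can be sketched as follows. This is a simplified illustration, not the commit's code: each (producer cpu, home cpu) pair gets its own single-producer/single-consumer ring, so pushing a buffer freed on the "wrong" cpu back to its home cpu needs no locks; `remote_free` and the ring layout are assumptions for the sketch.

```cpp
#include <array>
#include <atomic>
#include <cstddef>

constexpr int ncpus = 2;
constexpr std::size_t ring_size = 64;

// Lock-free single-producer/single-consumer ring of buffer pointers.
struct spsc_ring {
    std::array<void*, ring_size> slots{};
    std::atomic<std::size_t> head{0}, tail{0};

    bool push(void* p) {
        auto t = tail.load(std::memory_order_relaxed);
        if (t - head.load(std::memory_order_acquire) == ring_size) {
            return false;                       // ring full
        }
        slots[t % ring_size] = p;
        tail.store(t + 1, std::memory_order_release);
        return true;
    }

    void* pop() {
        auto h = head.load(std::memory_order_relaxed);
        if (h == tail.load(std::memory_order_acquire)) {
            return nullptr;                     // ring empty
        }
        void* p = slots[h % ring_size];
        head.store(h + 1, std::memory_order_release);
        return p;
    }
};

// rings[from][to]: buffers freed on cpu 'from' that belong to cpu 'to'.
spsc_ring rings[ncpus][ncpus];

// A worker item on home_cpu later drains the ring and does the real free().
void remote_free(int from_cpu, int home_cpu, void* buf) {
    rings[from_cpu][home_cpu].push(buf);
}
```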
    • percpu: map percpu variables to the same static elf section · f13e49c3
      Guy Zana authored
      We don't want to use malloc() in the percpu framework, since the
      allocator itself will have a percpu design.
      f13e49c3
    • mmu: handle static variable addresses in virt_to_phys() and phys_to_virt() · 96ee87ba
      Guy Zana authored
      The ELF is mapped 1:1, so this patch allows addresses of static
      variables to be translated as well (needed for the next patch).
      96ee87ba
    • pcpu-worker: add a per cpu worker thread that can execute work items · 45e40421
      Guy Zana authored
      Simply allows setting up and executing a handler in the context of
      a specified CPU; the handler is defined statically at compile time and
      is invoked when the worker_item is signaled for a specified CPU.
      
      It doesn't use locks, to avoid unnecessary contention.
      
      This is needed for the per-cpu memory allocator: instead of creating
      n additional threads (one per cpu), the plan is to define and register
      a simple handler (a lambda function).
      
      example of usage:
      
      void say_hello()
      {
          debug("Hello, world!");
      }
      
      // define hello_tester as a worker_item
      PCPU_WORKERITEM(hello_tester, [] { say_hello(); });
      
      .
      .
      .
      
      // anywhere in the code:
      hello_tester.signal(sched::cpus[1]);
      
      // will invoke say_hello() in the context of cpu 1
      
      Thanks to Avi for adding code that I was able to copy & paste :)
      45e40421
    • arch: add CACHELINE_ALIGNED macro · 2f6cf02f
      Guy Zana authored
      2f6cf02f
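A plausible definition of such a macro (the commit's exact form may differ; 64 bytes is the common x86 cache line size). Aligning hot per-cpu structures to a cache line keeps two cpus from false-sharing adjacent objects.

```cpp
#include <cassert>

// Assumed definition for illustration; real OSv code may differ.
#define CACHELINE_ALIGNED __attribute__((aligned(64)))

// One counter per cache line: no false sharing between adjacent cpus.
struct CACHELINE_ALIGNED counter {
    long value;
};

static_assert(alignof(counter) == 64, "one object per cache line");
static_assert(sizeof(counter) == 64, "padded to a full line");
```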
  6. Jul 01, 2013
    • sched: set up thread::current() for new threads earlier · 667e1c10
      Avi Kivity authored
      Currently, current() is set during the thread initialization sequence,
      which means that preemption prior to that point will see the wrong current().
      There's an irq_enable() there, but it's not very effective since interrupts
      are only disabled in that place during early smp bringup, and it's not trivial
      to disable interrupts for all new threads (we?
      667e1c10
    • sched: late initialize idle thread · 802d3633
      Avi Kivity authored
      Currently we initialize the idle thread in the cpu's constructor, which leads
      to a cycle, since a thread's initialization needs the cpu.  This works out
      somehow now, but is fragile and will break with succeeding patches.
      
      Defer idle thread initialization to a later stage.
      802d3633
    • sched: start with preemption disabled · 695375f6
      Avi Kivity authored
      Usually we don't care if threads are started with preemption enabled or
      disabled, since interrupts are disabled and no preemption can take place
      during thread startup.  However during system bringup we want to avoid
      calls into the scheduler while it is being initialized due to stray
      preempt_enable() calls.
      
      Do this by initializing preempt_counter to 1, and dropping it during
      thread start-up.
      695375f6
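The counter trick can be sketched in a few lines. This is an illustrative toy, not OSv's actual thread class, and `finish_startup` is an assumed hook name: the thread is born with `preempt_counter == 1` (preemption disabled) and drops the count itself once initialization completes, so stray preempt_enable() calls during bringup never reach the scheduler.

```cpp
#include <cassert>

// Illustrative sketch of a thread born with preemption disabled.
struct thread_sketch {
    unsigned preempt_counter = 1;      // starts at 1: preempt-disabled
    bool preemptable() const { return preempt_counter == 0; }
    void preempt_disable() { ++preempt_counter; }
    void preempt_enable() { --preempt_counter; }
    void finish_startup() { preempt_enable(); }  // assumed hook name
};
```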
    • x64: add a separate interrupt stack · b81a47a5
      Avi Kivity authored
      This makes it easier to debug faults that happen in interrupts (e.g. in the
      scheduler)
      
      Note that this means we cannot schedule within an exception (e.g. a page
      fault) for now, since the exception stack is per-cpu while the interrupt
      stack is per-thread (which allows scheduling).  If/when we implement
      scheduling within the page fault handler, we'll either need to trampoline
      to the interrupt stack, or have a per-thread exception stack.
      b81a47a5
  7. Jun 26, 2013
    • xen: implement paravirtual clock driver for xen · 70583a58
      Glauber Costa authored
      Unlike KVM, we won't use percpu variables because Xen already lays down
      statically the shared info structure, that includes the vcpu info pointer
      for each cpu.
      
      We could in theory use percpu variables to store pointers to the current cpu
      vcpu info, but I ended up giving up on this route.  Since our pcpu
      implementation has the overhead of computing addresses anyway, we may as
      well pay the price and compute it directly from the xen shared info.
      
      One of the things that comes with this is that we can compute precise timings
      using xenclock very early. Since we don't have *that* much to do early, it is
      unclear if KVM needs to be improved in this regard (probably not), so this
      becomes just a slight bonus.
      70583a58
  8. Jun 25, 2013
    • xen: negotiate usage of xen pci · 88d39cb7
      Glauber Costa authored
      Xen defines a protocol for determining whether or not PV drivers are
      available in an HVM guest. Upon successful negotiation, the documentation
      states that:
      
      "The relevant emulated devices then disappear from the relevant buses.  For
      most guest operating systems, you want to do this before device enumeration
      happens."
      
      This patch basically follows this protocol and stores the result for future usage.
      
      See more at: docs/misc/hvm-emulated-unplug.markdown
      88d39cb7
    • buildfix: pvclock xen definitions · a8b8cf06
      Glauber Costa authored
      a8b8cf06
    • xen: lay down detection basic infrastructure · 0c8f8f89
      Glauber Costa authored
      
      Xen's information can be in a variety of MSRs. We need to test them all
      and figure out which of them holds the information we want.
      
      Once we determine that, the xen initialization code is ready to be executed.
      This needs to run as early as possible, because all xen drivers will
      make use of it one way or another.
      
      The hypercall code is heavily inspired (aka mostly copied) from Avi's
      xen tentative patch, with the 5-argument hypercall removed (delayed until
      we need it)
      
      Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
      0c8f8f89
    • loader: skip reading disk on errors · 9fe8d677
      Glauber Costa authored
      
      This is not a very serious issue, but goes like this: the very simple read
      method we are attempting right now in the loader will keep reading from the
      disk until we reach a pre-determined max size. However, the disk is usually
      smaller than this. If this is the case, XEN dmesg logs are filled with messages
      indicating that we are trying to read from invalid LBAs, to the point of making
      the log useless for me.
      
      So although the annoyance is minor, the patch itself is minor too. If nobody
      opposes, I can apply it.
      
      Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
      9fe8d677
  9. Jun 17, 2013
    • percpu: per-cpu variables · 63ab89b6
      Avi Kivity authored
      Per-cpu variables can be used from contexts where preemption is disabled
      (such as interrupts) or when migration is impossible (pinned threads) for
      managing data that is replicated for each cpu.
      
      The API is a smart pointer to a variable which resolves to the object's
      location on the current cpu.
      
      Define:
      
         #include <osv/percpu.hh>
      
         PERCPU(int, my_counter);
         PERCPU(foo, my_foo);
      
      Use:
      
         ++*my_counter;
         my_foo->member = 7;
      63ab89b6
  10. Jun 16, 2013
  11. Jun 12, 2013
  12. Jun 11, 2013
    • x64: prevent nested exceptions from corrupting the stack · 1dbddc44
      Avi Kivity authored
      Due to the need to handle the x64 red zone, we use a separate stack for
      exceptions via the IST mechanism.  This means that a nested exception will
      reuse the parent exception's stack, corrupting it.  It is usually very hard
      to figure out the root cause when this happens.
      
      Prevent this by setting up a separate stack for nested exceptions, and
      aborting immediately if a nested exception happens.
      1dbddc44
  13. Jun 10, 2013
    • x64: switch ifunc resolvers to processor::features() · 48323f23
      Avi Kivity authored
      Now that processor::features() is initialized early enough, we can use it
      in ifunc dispatchers.
      48323f23
    • x64: make processor::features usable early on · 7fb119b0
      Avi Kivity authored
      cpuid is useful for ifunc-dispatched functions (like memcpy), so we can
      select the correct function based on available processor features.  Make
      processor::features available early to support this.
      
      We use a static function-local variable to ensure it is initialized early
      enough.
      7fb119b0
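The function-local-static idiom the message describes can be sketched as below. `features_t` and `detect_features()` are stand-ins (the real code would run cpuid here): a function-local static is constructed on first use, so even very early callers such as ifunc resolvers see an initialized value.

```cpp
#include <cassert>

struct features_t { bool erms; };

// Stand-in for the real cpuid probing (assumption for the sketch).
static features_t detect_features() {
    return features_t{true};
}

// The static is initialized on the first call, not at some unspecified
// point during global construction, so early callers are safe.
const features_t& features() {
    static features_t f = detect_features();
    return f;
}
```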
    • libc: optimized memcpy() · 06dd5386
      Avi Kivity authored
      If the cpu supports "Enhanced REP MOVS / STOS" (ERMS), use an rep movsb
      instruction to implement memcpy.  This speeds up copies significantly,
      especially large misaligned ones.
      06dd5386
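The ERMS fast path amounts to a single string instruction. This is a sketch, not the commit's code: real code would dispatch to it via ifunc only when cpuid reports ERMS, and the non-x86 branch here is just a portable fallback so the sketch runs anywhere.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// "rep movsb" copies rcx bytes from [rsi] to [rdi]; on ERMS-capable
// cpus the microcode makes this fast even for large/misaligned copies.
void* memcpy_rep_movsb(void* dest, const void* src, std::size_t n) {
    void* ret = dest;
#if defined(__x86_64__)
    asm volatile("rep movsb"
                 : "+D"(dest), "+S"(src), "+c"(n)
                 :
                 : "memory");
#else
    std::memcpy(dest, src, n);   // portable fallback for the sketch
#endif
    return ret;
}
```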
  14. Jun 09, 2013
    • Add, and use, new abort(msg) function · e6208f1e
      Nadav Har'El authored
      Recently Guy fixed abort() so it will *really* not infinitely recurse trying
      to print a message, using a lock, causing a new abort, ad infinitum.
      
      Unfortunately, that didn't fix one remaining case: DUMMY_HANDLER (see
      exceptions.cc) used the idiom
      
              debug(....); abort();
      
      which can again cause infinite recursion - a #GP calls debug() which causes a
      new #GP, which again calls debug, etc.
      
      Instead of the above broken idiom, I created a new function abort(msg), which is
      just like the familiar abort(), just changes the "Aborted" message to some
      other message (a constant string). Like abort(), the new variant abort(msg) will
      only print the message once even if called recursively - and uses a lockless
      version of debug().
      
      Note that the new abort(msg) is a C++-only API. C will only see the abort(void)
      which is extern "C". At first I wanted to call the new function panic(msg) and
      export it to C, but gave up when I saw the name panic() was already in use in a
      bunch of BSD code.
      e6208f1e
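The print-at-most-once guard can be sketched with a single atomic flag. Names here are illustrative (`abort_with_msg` stands in for the C++-only abort(msg) overload, and fputs for the lockless debug()); the point is that a recursive re-entry sees the flag already set and stays silent instead of recursing.

```cpp
#include <atomic>
#include <cassert>
#include <cstdio>
#include <cstdlib>

// Set once, locklessly; a recursive abort finds it already true.
static std::atomic<bool> aborting{false};

// Returns true only for the first caller, which prints the message.
bool print_abort_message_once(const char* msg) {
    if (aborting.exchange(true)) {
        return false;            // recursive or repeated entry: stay silent
    }
    std::fputs(msg, stderr);     // stand-in for the lockless debug()
    return true;
}

// Sketch of the C++-only abort(msg) variant the commit describes.
[[noreturn]] void abort_with_msg(const char* msg) {
    print_abort_message_once(msg);
    std::abort();
}
```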
  15. Jun 05, 2013
    • trace: improve fast path · b03979d9
      Avi Kivity authored
      When a tracepoint is disabled, we want it to have no impact on running code.
      
      This patch changes the fast path to be a single 5-byte nop instruction.  When
      a tracepoint is enabled, the nop is patched to a jump instruction to the
      out-of-line slow path.
      b03979d9
  16. May 27, 2013
    • Add "memory clobber" to STI and CLI instructions · a200bb7a
      Nadav Har'El authored
      When some code section happens to be called from both thread context and
      interrupt context, and we need mutual exclusion (we don't want the interrupt
      context to start while the critical section is in the middle of running in
      thread context), we surround the critical code section with CLI and STI.
      
      But we need the compiler to assure us that writes to memory done between
      the calls to CLI and STI stay between them. For example, if we have
      
          thread context:                 interrupt handler:
      
            CLI;                          a--;
            a++;
            STI;
      
      We don't want the a++ to be moved by the compiler before the CLI. We also
      don't want the compiler to save a's value in a register and only actually
      write it back to the memory location 'a' after the STI (when an interrupt
      handler might be concurrently writing). We also don't want the compiler
      to remember a's last value in a register and use it again after the next
      CLI.
      
      To ensure these things, we need the "memory clobber" option on both the CLI
      and STI instructions. The "volatile" keyword is not enough - it guarantees
      that the instruction isn't deleted or moved, but not that stuff that
      should have been in memory isn't just in registers.
      
      Note that Linux also has these memory clobbers on sti() and cli().
      Linus Torvalds explains in a post from 1996 why these were necessary:
      http://lkml.indiana.edu/hypermail/linux/kernel/9605/0214.html
      
      All that being said, we never noticed a bug caused by the missing
      "memory" clobbers. But better safe than sorry....
      a200bb7a
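The fix boils down to one token in the asm statement. A sketch (cli/sti are privileged and only assemble on x86, so they are guarded and shown for illustration only; the empty-asm `barrier()` exercises the same "memory" clobber safely in user space): "volatile" keeps the instruction in place, but only the "memory" clobber forces the compiler to flush registers to memory before the instruction and reload afterwards.

```cpp
#include <cassert>

#if defined(__x86_64__) || defined(__i386__)
// The interrupt-flag accessors with the added "memory" clobber.
inline void cli() { asm volatile("cli" ::: "memory"); }
inline void sti() { asm volatile("sti" ::: "memory"); }
#endif

// The same clobber on an empty asm is a plain compiler barrier:
// stores before it must hit memory, values after it must be reloaded.
inline void barrier() { asm volatile("" ::: "memory"); }
```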
  17. May 26, 2013
    • x64: use wrfsbase for faster context switching, when available · 3c9ba28d
      Avi Kivity authored
      Drops context switch time by ~80ns.
      3c9ba28d
    • x64: add wrfsbase accessor · bb33c998
      Avi Kivity authored
      Faster way to write fsbase on newer processors.
      bb33c998
    • signal handling: fix FPU clobbering bug · 94a7015e
      Nadav Har'El authored
      This patch adds missing FPU-state saving when calling signal handlers.
      The state is saved on the stack, to allow nesting of signal handling
      (delivery of a second signal while a first signal's handler is running).
      
      In Linux calling conventions, the FPU state is caller-saved, i.e., a
      called function can use FPU at will because the caller is assumed to have
      saved it if needed. However, signal handlers are called asynchronously,
      possibly in the middle of some FPU computation without that computation
      getting a chance to save its state. So we must save this state before calling
      the signal handling function.
      
      Without this fix, we had problems even if the signal handlers themselves
      did not use the FPU. A typical scenario - which we encountered in the
      "sunflow" benchmark - is that the signal handler does something which uses
      a mutex (e.g., malloc()) and causes a reschedule. The reschedule, unlike
      a preempt(), thinks it does not need to save the FPU state, and the
      thread we switch to clobbers this state.
      94a7015e
  18. May 18, 2013
  19. May 07, 2013
  20. May 06, 2013
  21. May 01, 2013
    • Unify "mutex_t" and "mutex" types · 3c692eaa
      Nadav Har'El authored
      Previously we had two different mutex types - "mutex_t" defined by
      <osv/mutex.h> for use in C code, and "mutex" defined by <mutex.hh>
      for use in C++ code. This difference is unnecessary, and causes a mess
      for functions that need to accept either type so they work for both C++
      and C code (e.g., consider condvar_wait()).
      
      So after this commit, we have just one include file, <osv/mutex.h>
      which works both in C and C++ code. This results in the same type
      and same functions being defined, plus some additional conveniences
      when in C++, such as method variants of the functions (e.g.,
      m.lock() in addition to mutex_lock(m)), and the "with_lock" function.
      
      The mutex type is now called either "mutex_t" or "struct mutex" in
      C code, or can also be called just "mutex" in C++ code (all three
      names refer to an identical type - there's no longer a different
      mutex_t and mutex type).
      
      This commit also modifies all the includers of <mutex.hh> to use
      <osv/mutex.h>, and fixes a few miscelleneous compilation issues
      that were discovered in the process.
      3c692eaa
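The single-header trick can be sketched as below. This is a toy (the lock is just a flag, and the real <osv/mutex.h> differs): one struct compiles in both C and C++, the free functions work everywhere, and the method conveniences are layered on only under __cplusplus.

```cpp
#include <cassert>

// One type, three names: mutex_t, struct mutex, and (in C++) mutex.
typedef struct mutex {
    int locked;                 // toy state, not a real lock
#ifdef __cplusplus
    void lock();                // C++-only method conveniences
    void unlock();
#endif
} mutex_t;

// C-visible API, available in both languages.
static inline void mutex_lock(mutex_t* m)   { m->locked = 1; }
static inline void mutex_unlock(mutex_t* m) { m->locked = 0; }

#ifdef __cplusplus
// Methods forward to the shared C functions.
inline void mutex::lock()   { mutex_lock(this); }
inline void mutex::unlock() { mutex_unlock(this); }
#endif
```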