Commits · 858a228989dd9f1de418ea27e0caf2e8d65cb3cc · Verlässliche Systemsoftware / projects / osv

Oct 24, 2013

arch-cpu.hh: Fix arch_thread forward declaration · ba2abdf7

Pekka Enberg authored 11 years ago


Spotted by Clang:

../../arch/x64/arch-cpu.hh:57:1: error: 'arch_thread' defined as a struct here but previously
      declared as a class [-Werror,-Wmismatched-tags]
struct arch_thread {
^
../../arch/x64/arch-cpu.hh:37:1: note: did you mean struct here?
class arch_thread;
^~~~~
struct

Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>

ba2abdf7
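
The mismatch Clang reports can be reproduced in a few lines. The following is a minimal sketch, not the contents of arch-cpu.hh: the `stack_top` member and the `stack_top_of()` helper are hypothetical and exist only to make the example compile. It shows a forward declaration that uses the same class-key as the later definition.

```cpp
#include <cassert>
#include <cstdint>

// The forward declaration uses the same class-key ("struct") as the
// definition below. Writing "class arch_thread;" here would still be valid
// C++, but Clang flags it with -Wmismatched-tags, and -Werror turns the
// warning into a build failure.
struct arch_thread;

// Hypothetical helper that only needs the incomplete type at this point.
std::uintptr_t stack_top_of(const arch_thread& t);

// Hypothetical member; the real arch_thread layout is not shown in the log.
struct arch_thread {
    std::uintptr_t stack_top;
};

std::uintptr_t stack_top_of(const arch_thread& t)
{
    return t.stack_top;
}
```

The same rule applies to `friend` declarations, which is what the arch_cpu commit below addresses.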

arch-cpu.hh: Fix arch_cpu forward declaration · 766b9719

Pekka Enberg authored 11 years ago


Spotted by Clang:

../../include/sched.hh:278:12: error: class 'arch_cpu' was previously declared as a struct
      [-Werror,-Wmismatched-tags]
    friend class arch_cpu;
           ^
../../arch/x64/arch-cpu.hh:39:8: note: previous use is here
struct arch_cpu {
       ^
../../include/sched.hh:278:12: note: did you mean struct here?
    friend class arch_cpu;
           ^~~~~
           struct

Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>

766b9719

Oct 23, 2013

Fix top of call stack - and treatment of unhandled C++ exceptions · 7fc023e8

Nadav Har'El authored 11 years ago


As noticed by Tomek in issue #64, unhandled C++ exceptions cause OSv to
silently hang, in an endless loop inside the unwinding code.

So this patch fixes the wrong CFI (DWARF Call Frame Information) which
caused the unwinder to loop. We just had a single line of assembly missing:
The topmost frame - the thread's main function - needs to undefine the
saved %rip to prevent going further back. If we don't do that, gdb will
end every "bt" output with a warning "Frame did not save its PC" (but hey,
nobody complained... ;-)), and the unwinding library will, unfortunately,
go into an endless loop as seen in issue #64.

With this one-line patch, unhandled exceptions now work as expected -
they abort with a message like:

	terminate called after throwing an instance of 'int'
	Aborted

And attaching a debugger you can see exactly where the offending throw came
from (i.e., the stack does *not* unnecessarily unwind when there's nobody
waiting to catch the exception).

This works for uncaught exceptions anywhere - including inside main()
and from constructors when loading the object (before running main()).

"bt" in gdb also no longer ends each stack trace with an error message.
The last frame it shows is "thread_main()".

Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>

7fc023e8

Oct 22, 2013

x64: Make dump_register() fault-handler safe · ab623c9f

Pekka Enberg authored 11 years ago


The debug() call can deadlock because it's using boost format. Switch to
debug_ll().

Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>

ab623c9f

x64: Fix missing newline in dump_registers() · e8b33142

Pekka Enberg authored 11 years ago


The debug() format string is missing a newline. Fix that up.

Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>

e8b33142

Fix linker error in make mode=debug · 7f9faa65

Tomasz Grabiec authored 11 years ago


This is a workaround for a linker error when compiling with -O0:

  `.text._Z9safe_loadIcEbPKT_RS0_' referenced in section `.text.fixup'
  of core/mmu.o: defined in discarded section
  `.text._Z9safe_loadIcEbPKT_RS0_[_Z9safe_loadIcEbPKT_RS0_]' of
  core/mmu.o

The safe_load() template is used in both runtime.cc and core/mmu.cc
but the linker keeps it only in one section discarding the other.

Signed-off-by: Tomasz Grabiec <tgrabiec@cloudius-systems.com>

7f9faa65

Oct 16, 2013

x64: Register dump on GP fault · ca52fa23

Pekka Enberg authored 11 years ago

Dump registers on general protection fault for debugging purposes. Even
if you have gdb available, getting to the exception frame is not always
possible after OSv has crashed.

Example output looks as follows:

registers:
RIP: 0x0000100000b7e913 RFL: 0x0000000000010202 CS: 0x0000000000000008 SS: 0x0000000000000010
RAX: 0xffffc000418ed278 RBX: 0xffffc00041b2c050 RCX: 0x0000000000000004 RDX: 0x0000000000000000
RSI: 0x0000000000000001 RDI: 0x43e0000000000000 RBP: 0x0000200008548d10 R8: 0xffffc000426e3010
R9: 0x0000000000000004 R10: 0x43e0000000000000 R11: 0xffffc00041b2c050 R12: 0xffffc000418ed1e8
R13: 0x0000000000000004 R14: 0x43e0000000000000 R15: 0xffffc00041b2c050 RSP: 0x0000200008548aa0
general protection fault

Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>

ca52fa23
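
As an illustration of the output format above, the sketch below formats the first dump line into a caller-supplied buffer. It is an assumption-laden example, not the OSv implementation: `fault_frame` and `format_frame()` are hypothetical names, and `snprintf` is used here only because it needs no heap allocation from the caller.

```cpp
#include <cassert>
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <cstring>

// Hypothetical subset of an exception frame; field names are illustrative.
struct fault_frame {
    std::uint64_t rip, rflags, cs, ss;
};

// Formats the first line of the dump into a fixed buffer owned by the
// caller. Returns what snprintf returns: the number of characters the full
// line needs, excluding the terminating NUL.
int format_frame(char* buf, std::size_t len, const fault_frame& f)
{
    return std::snprintf(buf, len,
        "RIP: 0x%016" PRIx64 " RFL: 0x%016" PRIx64
        " CS: 0x%016" PRIx64 " SS: 0x%016" PRIx64 "\n",
        f.rip, f.rflags, f.cs, f.ss);
}
```

Zero-padding every value to 16 hex digits keeps the columns aligned, which makes consecutive dumps easy to compare by eye.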

Oct 11, 2013

x64: Fix nested exception debugging · 82301253

Pekka Enberg authored 11 years ago


As of commit a449b889 ("x64: Enable sleeping in fault context") it's now
safe for another thread to enter a fault handler on the same CPU.  Fix
exception guard to reflect that.

This is needed for demand paging where a page fault from another thread
can happen on the same CPU where a thread is sleeping in the page fault
handler.

Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>

82301253

Oct 10, 2013

build: define _KERNEL everywhere · 95ce17e3

Avi Kivity authored 11 years ago

We have _KERNEL defines scattered throughout the code, which makes
understanding it difficult.

Define it just once, and adjust the source to build.

We define it in an overridable variable, so that non-kernel imported code
can undo it.

95ce17e3

Oct 01, 2013

x64: Enable sleeping in fault context · a449b889

Pekka Enberg authored 11 years ago


In preparation for enabling demand paging, enable sleeping in fault
context by using a per-thread exception stack for normal faults and
per-CPU exception stack for nested faults.

Avi Kivity explains:

  Before [demand paging] can even hope to work, we need to enable
  sleeping in fault context.  Right now each cpu has its own exception
  stack, which leads immediately to stack corruption:

  thread 1 faults
  enters exception stack
  tries to take mutex
  scheduler switches to thread 2
  thread 2 faults
  enters same exception stack

  So we need to switch stacks.  This can be done in the same way as for
  interrupt stacks (see thread::switch_to()).

Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
Signed-off-by: Avi Kivity <avi@cloudius-systems.com>

a449b889
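
The stack selection described above can be modelled in a few lines. This is a toy sketch under stated assumptions, not the code from the patch: `stack_ref`, `thread_ctx`, `cpu_ctx`, and `pick_fault_stack()` are hypothetical names, and stacks are represented by integer ids rather than real memory.

```cpp
#include <cassert>

// Hypothetical model: each stack is identified by a small integer.
struct stack_ref { int id; };

struct thread_ctx {
    stack_ref exception_stack;  // per-thread, used for normal faults
    int fault_depth;            // 0 when the thread is not inside a fault
};

struct cpu_ctx {
    stack_ref nested_exception_stack;  // per-CPU, used for nested faults
};

// Picks the stack for a fault taken by thread t while running on cpu c.
// A thread that is not already handling a fault uses its own stack, so a
// second thread faulting on the same CPU cannot overwrite the first one's
// frames while the first is sleeping in its handler.
const stack_ref& pick_fault_stack(const cpu_ctx& c, const thread_ctx& t)
{
    return t.fault_depth == 0 ? t.exception_stack : c.nested_exception_stack;
}
```

In this model, two threads faulting on the same CPU get different stacks, which is the property the quoted scenario lacks.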

Sep 30, 2013

kvmclock: Implement support for old kvmclock MSRs. · e4ceea63

Venkatesh Srinivas authored 11 years ago


Older versions of KVM and user VMMs expose kvmclock MSRs at different
MSR offsets. Detect the old flag in kvmclock::probe() and use the old
MSRs if they are the only ones available.

Signed-off-by: Venkatesh Srinivas <venkateshs@google.com>
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>

e4ceea63
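
The fallback logic can be sketched as follows. The feature bits and MSR indices are the values published in the Linux kernel's kvm_para.h header; the `kvmclock_msrs` struct and the `select_kvmclock_msrs()` helper are hypothetical and are not taken from kvmclock::probe().

```cpp
#include <cassert>
#include <cstdint>

// Feature bits from the KVM CPUID features leaf (values as in Linux's
// kvm_para.h).
constexpr std::uint32_t kvm_feature_clocksource  = 1u << 0;  // old MSRs
constexpr std::uint32_t kvm_feature_clocksource2 = 1u << 3;  // new MSRs

// Hypothetical pair of MSR indices used by the clock driver.
struct kvmclock_msrs {
    std::uint32_t wall_clock;
    std::uint32_t system_time;
};

// Prefers the new MSRs and falls back to the old ones only if they are the
// only ones advertised. Returns false if kvmclock is not available at all.
bool select_kvmclock_msrs(std::uint32_t features, kvmclock_msrs& out)
{
    if (features & kvm_feature_clocksource2) {
        out = kvmclock_msrs{0x4b564d00, 0x4b564d01};
        return true;
    }
    if (features & kvm_feature_clocksource) {
        out = kvmclock_msrs{0x11, 0x12};
        return true;
    }
    return false;
}
```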

Sep 21, 2013

xen: use c++ interrupt handler · ea4cb9f6

Glauber Costa authored 11 years ago

Now that we have an efficient interrupt handler, use it. There is no need to delete the
old BSD code; leaving it in place avoids disrupting the file too much. Make sure through
an assertion that it is never used, though.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>

ea4cb9f6

xen: rework interrupt handler · a837afe5

Glauber Costa authored 11 years ago


This version of the Xen interrupt handler tries to do as little work as possible
in the interrupt itself. The previous version and my previous fix attempt would
still clean the channels during interrupt.

Because now we have pending_sel still set in the irq thread, we can ditch
_irq_pending completely.

There is now only one xen_irq for the entire system, and therefore I am
registering one per cpu, since we will eventually have to process this on
different cpus (for different event channels).

With this, in my (very coarse, host to guest) netperf test, I am achieving
9600 * 10^6 bps, while linux can reach ~10000 * 10^6 bps. So we're getting close:

Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

 65536  16384  16384    10.00    9589.32

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>

a837afe5

xen: declare shared types as atomic · de6ba640

Glauber Costa authored 11 years ago

Some of the fields in the xen shared structure need to be accessed atomically.
Move them to std::atomic so we can do that using C++11 primitives.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>

de6ba640
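
A minimal sketch of what atomic access to such a field can look like with C++11 primitives. The `shared_word` struct and both helpers are hypothetical; the real Xen shared structure layout is not reproduced here.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for one word of pending event bits that is shared
// with the hypervisor.
struct shared_word {
    std::atomic<unsigned long> pending{0};
};

// Marks one event as pending, as the producer side would.
void set_pending(shared_word& w, unsigned bit)
{
    w.pending.fetch_or(1ul << bit, std::memory_order_release);
}

// Fetches all pending bits and clears them in a single atomic step, so a
// bit set concurrently is either returned by this call or left in place for
// the next one. It is never lost.
unsigned long take_pending(shared_word& w)
{
    return w.pending.exchange(0, std::memory_order_acq_rel);
}
```

With a plain `unsigned long`, the read and the clear would be two separate operations, and an event arriving between them could be dropped.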

Sep 18, 2013

percpu: use correct percpu sect size to support 64 vcpus · 439bad31

Sasha Levin authored 11 years ago


percpu had too little space allocated to support 64 vcpus, which
led to a crash when booting with more than 13 vcpus. Fix it by
using a correct size to support 64 vcpus.

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Avi Kivity <avi@cloudius-systems.com>

439bad31

Sep 15, 2013

Add Cloudius copyright to everything in arch/x64 · 6e918b0c

Nadav Har'El authored 11 years ago

Add Cloudius copyright to everything in arch/x64. This includes C++ code, assembly
code, and ld scripts.

6e918b0c

Sep 12, 2013

Support for Xen w/o vector callbacks · 1d3e336c

Dmitry Fleytman authored 11 years ago

This patch implements GSI interrupt support for Xen bus. Needed in Xen environments
w/o vector callbacks for HVM. One example of such an environment is Amazon EC2.

1d3e336c

Logic for GSI level triggered interrupt added · aeb82f51

Dmitry Fleytman authored 11 years ago

aeb82f51

Sep 11, 2013

XAPIC support implemented · aa98b306

Dmitry Fleytman authored 11 years ago

XAPIC is supported as a fall-back when X2APIC is not available

aa98b306

Sep 05, 2013

boot16.S: open up space for partition table · 4a6d51d5

Glauber Costa authored 11 years ago

Because we will be copying the bootloader code to the beginning of the disk, make
sure we won't step over the partition table space. This is technically not needed
if the code is small enough, but this guard code will 1) make sure that doesn't
happen, and 2) make sure the space is zeroed out.

The signature though, is needed, and is set to the bytes "O", "S" and "V", which
will span VSO in the end.

4a6d51d5

bootloader: move count32 variable · fcf173eb

Glauber Costa authored 11 years ago

It currently sits in the middle of the partition table. Move it to a safer
location.

fcf173eb

acpi: move table initialization to its own constructor · bf15592d

Glauber Costa authored 11 years ago

Right now we are doing it right before we parse the MADT, but this is by no
means MADT-specific. Other users are planned, and the best way to resolve the
disputes is to have it in a separate constructor.

bf15592d

Aug 28, 2013

work around xen x2apic bug · cc3d517a

Glauber Costa authored 11 years ago

The x2APIC specification says that reading from the X2APIC_ID MSR should return
the physical apic id of the current processor. However, the Xen implementation
(as of 4.2.2) is broken, and reads actually return the old-style xAPIC id. Even if
they fix it, we still have HVs deployed around that will return the wrong ID.
We can work around this by testing if the returned APIC id is in the form (id
<< 24), since in that case, the low 24 bits will all be zeroed. Then at least
we can get this working everywhere. This may pose a problem if we want to ever
support more than 1 << 24 vCPUs (or if any other HV has some random x2apic
ids), but that is highly unlikely anyway.

cc3d517a
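
The check described above can be sketched as a small pure function. `normalize_apic_id()` is a hypothetical name and the function is an illustration of the test on the low 24 bits, not the code from the patch.

```cpp
#include <cassert>
#include <cstdint>

// If the value read from the x2APIC ID MSR looks like an xAPIC-style ID
// (id << 24, so the low 24 bits are all zero), shift it back down.
// Otherwise assume it is already a proper x2APIC ID and return it unchanged.
std::uint32_t normalize_apic_id(std::uint32_t raw)
{
    constexpr std::uint32_t low24 = (1u << 24) - 1;
    if (raw != 0 && (raw & low24) == 0) {
        return raw >> 24;
    }
    return raw;
}
```

As the commit message notes, this heuristic misreads any genuine x2APIC ID that is a nonzero multiple of 1 << 24, which is why it only holds while such IDs do not occur in practice.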

apic: bringup cpus individually instead of all at the same time · 5cb16020

Glauber Costa authored 11 years ago

As I have described in a previous patch, the Xen hypervisor has a very nasty
bug that causes all of the x2apic msr writes to trigger a GPF. Although the
request proceeds fine despite the GPF, it does bring a problem for all-but-self
style init sequences we are using: after "failing" (succeeding but returning
failure) to deliver the interrupt for the first cpu in the group, xen will
break the loop, therefore not delivering the SIPIs to other cpus in the system
at all. We can work around that by delivering interrupts to each cpu
individually, instead of all-but-self.

5cb16020

implement wrmsr_safe · a7ea5784

Glauber Costa authored 11 years ago

Unfortunately, the Xen hypervisor has a very nasty bug (seems to be fixed by a
2013 patch - which means that although it is fixed, a lot of hypervisors will
have it) that causes all of the x2apic msr writes to init related registers
(INIT, SIPI, etc) to trigger a GPF. The way to work around this is to implement a
form of "wrmsr_safe".

a7ea5784

Aug 27, 2013

cpu: initialize the FPU and CSR register · 04ddff7a

Glauber Costa authored 11 years ago

We can't trust the state of the FPU and the CSR registers to be always sane.
Apparently, they aren't on at least one version of Xen (which happens to be
the one I am using). Initialize them manually for all CPUs on bringup.

04ddff7a

xen: correctly ack interrupts · bcf77dc9

Glauber Costa authored 11 years ago

In the xen interrupt code, I have made the mistake of exchanging the previous
value of _irq_pending with true, which means that we were constantly polling
for data in the interrupt threads.

This was responsible for the latency spikes I was seeing. The simple "ping"
test still shows bad results in absolute terms, but at least now the spikes are
gone.

bcf77dc9
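
The mistake is easy to see in isolation. In the sketch below, `irq_pending` is a hypothetical stand-in for `_irq_pending`, and both functions are illustrative rather than copied from the patch.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical flag standing in for _irq_pending.
std::atomic<bool> irq_pending{false};

// Buggy consumer: exchanging with true leaves the flag set, so every later
// check also reports "pending" and the interrupt thread keeps polling.
bool consume_buggy()
{
    return irq_pending.exchange(true);
}

// Fixed consumer: exchanging with false returns the previous value and
// clears the flag in the same atomic step, which acknowledges the interrupt.
bool consume()
{
    return irq_pending.exchange(false);
}
```

After one real interrupt, `consume()` reports true once and then false, while `consume_buggy()` reports true on every call after the first.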

Aug 26, 2013

signal: avoid nested signals · 4af36771

Avi Kivity authored 11 years ago

A signal within a signal handler is really bad news. Abort when it happens
to let the developers debug it.

4af36771

mmu: don't pass really bad faults to the application · 6f464e76

Avi Kivity authored 11 years ago

Trying to execute the null pointer, or faulting within the kernel code, is
a really bad sign, and it's better to abort early in those cases.

6f464e76

Aug 21, 2013

libc: drop sse4.1 ceil(), floor() · 4bc96b95

Avi Kivity authored 11 years ago

The dependency on sse4.1 crashes on older cpus; use the generic musl
implementation instead.

4bc96b95

Aug 13, 2013

xen pv alternatives · 3f4c4e19

Glauber Costa authored 11 years ago

To make matters even clearer, enclose the main alternative macro in a
xen-specific macro, so we don't have to code xen's presence condition
everywhere.

3f4c4e19

xen: skeleton init sequence · 53ccd7a2

Avi Kivity authored 11 years ago

Add a xen_init() function (currently only stores the start_info pointer)
and jump to the normal init sequence.

[ glommer: rebased to current tip ]

53ccd7a2

xen: add xen metadata · 72f795e6

Avi Kivity authored 11 years ago

This metadata identifies the kernel to Xen and enables the pv loader.

[ glommer: adjusted to current tip and fixed header include ]

72f795e6

xen: set callback hypercall · 30313c4f

Glauber Costa authored 11 years ago

By issuing this hypercall, we can control where xen delivers the interrupt to.
Right now we will only support vectored callbacks. It should not be hard to extend
this for gsi and intx for HVM guests.

30313c4f

xen: build features array · 965639f3

Glauber Costa authored 11 years ago

If we have this array, BSD code that checks for features can run unmodified.

965639f3

Xen layer for interrupt registering · fbec6608

Glauber Costa authored 11 years ago

The BSD pv event channel will expect the underlying OS to be able to register a
PIC. For now we will just allow that for xen, and provide the expected
translation functions to allow xen to work.

The design I have chosen is to let the xen event handler run in interrupt
context. We can do threaded if it really becomes a problem, but right now it
should do. The handlers themselves, though, will be threaded. So the
intr_execute_handlers() function will do nothing more than to wake the
respective threads.

BSD will provide us functions, not threads. So we have a common thread that
executes the function that we were given. One exception for this is the xenstore.
The xenstore is already threaded, and its interrupt handler will also just wake
up a thread. So for that we could do better in the future.

fbec6608

xen: provide C-accessible HYPERVISOR_shared_info · 6fc3d663

Glauber Costa authored 11 years ago

Xen files in BSD (and Linux for that matter) expect a variable called
HYPERVISOR_shared_info that points to the hypercall page - that in our case is
statically defined. So we just need to point it with the correct name to our
shared info page.

Note the type mismatch: we are defining our own xen_shared_info to be able to
access some parts of the structure, like the wallclock, more conveniently.
Because of that, we need a type cast.

6fc3d663

generalize interrupt handler further · 4fbb1834

Glauber Costa authored 11 years ago

Xen does not need to EOI. At least not with the APIC anyway: it signals
end of interrupt by flipping vcpu_info->evtchn_upcall_pending to 0, but
that is already done by the BSD handler, so we might as well do nothing.

Avi generalized the irq handler to have a pre_eoi and a handler, and in
this patch I am taking the extra step of adding an EOI indirection as well.

4fbb1834
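
One way to picture the extra indirection is a handler made of three hooks. This is a sketch under assumptions: the `irq_hooks` struct, the `dispatch()` function, the call order, and the convention that `pre_eoi` returns whether the handler should run are all hypothetical and not taken from the patch.

```cpp
#include <cassert>
#include <functional>

// Hypothetical shape of a generalized interrupt handler.
struct irq_hooks {
    std::function<bool ()> pre_eoi;  // assumed: true if the handler should run
    std::function<void ()> eoi;      // how this source signals end-of-interrupt
    std::function<void ()> handler;  // the actual work
};

// Runs the hooks in order. A source such as Xen, where the end of interrupt
// is already signalled elsewhere, can supply an empty or no-op eoi hook.
void dispatch(const irq_hooks& h)
{
    bool run = h.pre_eoi ? h.pre_eoi() : true;
    if (h.eoi) {
        h.eoi();
    }
    if (run && h.handler) {
        h.handler();
    }
}
```

With the EOI step behind a hook, the dispatch path stays identical for APIC-driven and Xen-driven interrupts; only the hook contents differ.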

Aug 12, 2013

trivial: name the event channel fields in the vcpu · 0873adf5

Glauber Costa authored 11 years ago

There are two spaces for event channel fields in the xen vcpu data.  They were
so far just a pad because we were not using event channels.  Name them, so we
can use them. I am also taking the opportunity to fix the tabs/spaces in the
structure.

0873adf5

Aug 04, 2013

build: improve .text section generation · 3d58de8e

Avi Kivity authored 11 years ago

gcc generates some functions in their own section.  Have a wildcard that
catches all of these sections so they can all be merged into the global
.text section; this makes 'perf kvm top' format its output better.

The catch-all wildcard is placed last since ld uses the first match.

3d58de8e