Commits · 9c724a4b9c8966d4c8cca92673d8a86d638e9bdc · Verlässliche Systemsoftware / projects / osv

Oct 01, 2013

x64: Enable sleeping in fault context · a449b889

Pekka Enberg authored 11 years ago


In preparation for enabling demand paging, enable sleeping in fault
context by using a per-thread exception stack for normal faults and
per-CPU exception stack for nested faults.

Avi Kivity explains:

  Before [demand paging] can even hope to work, we need to enable
  sleeping in fault context.  Right now each cpu has its own exception
  stack, which leads immediately to stack corruption:

  thread 1 faults
  enters exception stack
  tries to take mutex
  scheduler switches to thread 2
  thread 2 faults
  enters same exception stack

  So we need to switch stacks.  This can be done in the same way as for
  interrupt stacks (see thread::switch_to()).

Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
Signed-off-by: Avi Kivity <avi@cloudius-systems.com>

a449b889

Sep 30, 2013

kvmclock: Implement support for old kvmclock MSRs. · e4ceea63

Venkatesh Srinivas authored 11 years ago


Older versions of KVM and user VMMs expose kvmclock MSRs at different
MSR offsets. Detect the old flag in kvmclock::probe() and use the old
MSRs if they are the only ones available.
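
A minimal sketch of the selection logic, not the code from this commit. The feature bits and MSR numbers are the ones documented for the KVM paravirtual interface; the `kvmclock_msrs` struct and the `select_kvmclock_msrs()` helper are hypothetical names used only for illustration, and the real `kvmclock::probe()` may be structured differently.

```cpp
#include <cassert>
#include <cstdint>

// Feature bits reported in EAX of the KVM CPUID features leaf.
constexpr uint32_t KVM_FEATURE_CLOCKSOURCE  = 1u << 0;  // old MSRs available
constexpr uint32_t KVM_FEATURE_CLOCKSOURCE2 = 1u << 3;  // new MSRs available

// Hypothetical container for the pair of MSR indices kvmclock needs.
struct kvmclock_msrs {
    uint32_t wall_clock;
    uint32_t system_time;
};

constexpr kvmclock_msrs old_msrs{0x11, 0x12};
constexpr kvmclock_msrs new_msrs{0x4b564d00, 0x4b564d01};

// Hypothetical helper: prefer the new MSRs, fall back to the old ones only
// when they are the only ones advertised. Returns false if kvmclock is
// not available at all.
inline bool select_kvmclock_msrs(uint32_t features, kvmclock_msrs& out)
{
    if (features & KVM_FEATURE_CLOCKSOURCE2) {
        out = new_msrs;
        return true;
    }
    if (features & KVM_FEATURE_CLOCKSOURCE) {
        out = old_msrs;
        return true;
    }
    return false;
}
```

The feature word would come from CPUID on a real guest; here it is passed in as a plain integer so the decision can be exercised without a hypervisor.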

Signed-off-by: Venkatesh Srinivas <venkateshs@google.com>
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>

e4ceea63

Sep 21, 2013

xen: use c++ interrupt handler · ea4cb9f6

Glauber Costa authored 11 years ago

Now that we have an efficient interrupt handler, use it. The old BSD code is
not deleted, to avoid disrupting the file too much, but an assertion makes
sure that it is never used.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>

ea4cb9f6

xen: rework interrupt handler · a837afe5

Glauber Costa authored 11 years ago


This version of the Xen interrupt handler tries to do as little work as possible
in the interrupt itself. The previous version and my previous fix attempt would
still clean the channels during interrupt.

Because now we have pending_sel still set in the irq thread, we can ditch
_irq_pending completely.

There is now only one xen_irq for the entire system, and therefore I am
registering one per cpu, since we will eventually have to process this on
different cpus (for different event channels).

With this, in my (very coarse, host to guest) netperf test, I am achieving
9600 * 10^6 bps, while Linux can reach ~10000 * 10^6 bps. So we're getting close:

Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

 65536  16384  16384    10.00    9589.32

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>

a837afe5

xen: declare shared types as atomic · de6ba640

Glauber Costa authored 11 years ago

Some of the fields in the xen shared structure need to be accessed atomically.
Move them to std::atomic so we can do that using C++11 primitives.
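
As an illustration of what the C++11 primitives allow, here is a minimal sketch assuming a single word of pending event bits. The `shared_word` type and both helpers are hypothetical stand-ins, not the layout of the real Xen shared structure.

```cpp
#include <atomic>
#include <cassert>
#include <vector>

// Hypothetical stand-in for one word of pending bits in a shared page.
struct shared_word {
    std::atomic<unsigned long> pending{0};
};

// Producer side (the hypervisor, in reality): mark one port as pending.
inline void mark_pending(shared_word& w, unsigned port)
{
    w.pending.fetch_or(1UL << port, std::memory_order_release);
}

// Consumer side: atomically take all pending bits and clear the word, so
// a bit set concurrently is either returned now or left for the next call.
inline std::vector<unsigned> claim_pending(shared_word& w)
{
    unsigned long bits = w.pending.exchange(0, std::memory_order_acquire);
    std::vector<unsigned> ports;
    for (unsigned port = 0; bits != 0; ++port, bits >>= 1) {
        if (bits & 1) {
            ports.push_back(port);
        }
    }
    return ports;
}
```

With a plain integer field, the read and the clear would be two separate steps and a bit set in between could be lost; the single `exchange()` closes that window.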

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>

de6ba640

Sep 18, 2013

percpu: use correct percpu sect size to support 64 vcpus · 439bad31

Sasha Levin authored 11 years ago


percpu had too little space allocated to support 64 vcpus, which
led to a crash when booting with more than 13 vcpus. Fix it by
using a correct size to support 64 vcpus.

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Avi Kivity <avi@cloudius-systems.com>

439bad31

Sep 15, 2013

Add Cloudius copyright to everything in arch/x64 · 6e918b0c

Nadav Har'El authored 11 years ago

Add Cloudius copyright to everything in arch/x64. This includes C++ code, assembly
code, and ld scripts.

6e918b0c

Sep 12, 2013

Support for Xen w/o vector callbacks · 1d3e336c

Dmitry Fleytman authored 11 years ago

This patch implements GSI interrupt support for Xen bus. Needed in Xen environments w/o vector callbacks for HVM. One example of such an environment is Amazon EC2.

1d3e336c

Logic for GSI level triggered interrupt added · aeb82f51

Dmitry Fleytman authored 11 years ago

aeb82f51

Sep 11, 2013

XAPIC support implemented · aa98b306

Dmitry Fleytman authored 11 years ago

XAPIC is supported as a fall-back when X2APIC is not available

aa98b306

Sep 05, 2013

boot16.S: open up space for partition table · 4a6d51d5

Glauber Costa authored 11 years ago

Because we will be copying the bootloader code to the beginning of the disk, make
sure we won't step over the partition table space. This is technically not needed
if the code is small enough, but this guard code will 1) make sure that doesn't
happen, and 2) make sure the space is zeroed out.

The signature though, is needed, and is set to the bytes "O", "S" and "V", which
will span VSO in the end.

4a6d51d5

bootloader: move count32 variable · fcf173eb

Glauber Costa authored 11 years ago

It currently sits in the middle of the partition table. Move it to a safer
location.

fcf173eb

acpi: move table initialization to its own constructor · bf15592d

Glauber Costa authored 11 years ago

Right now we are doing it right before we parse the MADT, but this is by no
means MADT-specific. Other users are planned, and the best way to resolve the
disputes is to have it in a separate constructor.

bf15592d

Aug 28, 2013

work around xen x2apic bug · cc3d517a

Glauber Costa authored 11 years ago

The x2APIC specification says that reading from the X2APIC_ID MSR should return
the physical apic id of the current processor. However, the Xen implementation
(as of 4.2.2) is broken, and reads actually return old style xAPIC id. Even if
they fix it, we still have HVs deployed around that will return the wrong ID.
We can work around this by testing if the returned APIC id is in the form (id
<< 24), since in that case, the low 24 bits will all be zero. Then at least
we can get this working everywhere. This may pose a problem if we want to ever
support more than 1 << 24 vCPUs (or if any other HV has some random x2apic
ids), but that is highly unlikely anyway.
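
The test described above can be sketched as follows. `normalize_x2apic_id()` is a hypothetical name; the sketch only shows the bit test and leaves out the actual MSR read.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: if the value read from the x2APIC ID register looks
// like an old-style xAPIC id (id << 24, so the low 24 bits are zero),
// shift it back down; otherwise treat it as a genuine x2APIC id.
constexpr uint32_t normalize_x2apic_id(uint32_t raw)
{
    return (raw & 0x00ffffffu) == 0 ? raw >> 24 : raw;
}
```

An id of 0 is unaffected either way, and any genuine x2APIC id below 1 << 24 other than 0 has at least one of its low 24 bits set, so it passes through unchanged. That is the limit the message refers to.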

cc3d517a

apic: bringup cpus individually instead of all at the same time · 5cb16020

Glauber Costa authored 11 years ago

As I have described in a previous patch, the Xen hypervisor has a very nasty
bug that causes all of the x2apic msr writes to trigger a GPF. Although the
request proceeds fine despite the GPF, it does bring a problem for all-but-self
style init sequences we are using: after "failing" (succeeding but returning
failure) to deliver the interrupt for the first cpu in the group, xen will
break the loop, therefore not delivering the SIPIs to other cpus in the system
at all. We can work around that by delivering interrupts to each cpu
individually, instead of all-but-self.

5cb16020

implement wrmsr_safe · a7ea5784

Glauber Costa authored 11 years ago

Unfortunately, the Xen hypervisor has a very nasty bug (seems to be fixed by a
2013 patch - which means that although it is fixed, a lot of hypervisors will
still have it), that causes all of the x2apic msr writes to init related
registers (INIT, SIPI, etc) to trigger a GPF. The way to work around this is to
implement a form of "wrmsr_safe".

a7ea5784

Aug 27, 2013

cpu: initialize the FPU and CSR register · 04ddff7a

Glauber Costa authored 11 years ago

We can't trust the state of the FPU and the CSR registers to be always sane.
Apparently, they aren't on at least one version of Xen (which happens to be
the one I am using). Initialize them manually for all CPUs on bringup.

04ddff7a

xen: correctly ack interrupts · bcf77dc9

Glauber Costa authored 11 years ago

In the xen interrupt code, I have made the mistake of exchanging the previous
value of _irq_pending with true, which means that we were constantly polling
for data in the interrupt threads.
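
A minimal sketch of the corrected pattern, assuming a single boolean flag. `_irq_pending` is the name from the message; the `irq_flag` wrapper and its helpers are hypothetical.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical wrapper around a flag like _irq_pending.
struct irq_flag {
    std::atomic<bool> pending{false};
};

// Interrupt side: announce that there is work to do.
inline void raise(irq_flag& f)
{
    f.pending.store(true, std::memory_order_release);
}

// Thread side: take the notification and clear it in one step.
// Exchanging with 'true' here instead would leave the flag set forever,
// so the thread would find "work" on every pass and keep polling.
inline bool consume(irq_flag& f)
{
    return f.pending.exchange(false, std::memory_order_acquire);
}
```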

This was responsible for the latency spikes I was seeing. The simple "ping"
test still shows bad results in absolute terms, but at least now the spikes are
gone.

bcf77dc9

Aug 26, 2013

signal: avoid nested signals · 4af36771

Avi Kivity authored 11 years ago

A signal within a signal handler is really bad news; abort when it happens
to let the developers debug it.

4af36771

mmu: don't pass really bad faults to the application · 6f464e76

Avi Kivity authored 11 years ago

Trying to execute the null pointer, or faults within the kernel code, are
a really bad sign and it's better to abort early with them.

6f464e76

Aug 21, 2013

libc: drop sse4.1 ceil(), floor() · 4bc96b95

Avi Kivity authored 11 years ago

The dependency on sse4.1 crashes on older cpus; use the generic musl
implementation.

4bc96b95

Aug 13, 2013

xen pv alternatives · 3f4c4e19

Glauber Costa authored 11 years ago

To make matters even clearer, enclose the main alternative macro in a
xen-specific macro, so we don't have to code xen's presence condition
everywhere.

3f4c4e19

xen: skeleton init sequence · 53ccd7a2

Avi Kivity authored 11 years ago

Add a xen_init() function (currently only stores the start_info pointer)
and jump to the normal init sequence.

[ glommer: rebased to current tip ]

53ccd7a2

xen: add xen metadata · 72f795e6

Avi Kivity authored 11 years ago

This metadata identifies the kernel to Xen and enables the pv loader.

[ glommer: adjusted to current tip and fixed header include ]

72f795e6

xen: set callback hypercall · 30313c4f

Glauber Costa authored 11 years ago

By issuing this hypercall, we can control where xen delivers the interrupt to.
Right now we will only support vectored callbacks. It should not be hard to extend
this for gsi and intx for HVM guests.

30313c4f

xen: build features array · 965639f3

Glauber Costa authored 11 years ago

If we have this array, BSD code that checks for features can run unmodified.

965639f3

Xen layer for interrupt registering · fbec6608

Glauber Costa authored 11 years ago

The BSD pv event channel will expect the underlying OS to be able to register a
PIC. For now we will just allow that for xen, and provide the expected
translation functions to allow xen to work.

The design I have chosen is to let the xen event handler run in interrupt
context. We can do threaded if it really becomes a problem, but right now it
should do. The handlers themselves, though, will be threaded. So the
intr_execute_handlers() function will do nothing more than to wake the
respective threads.

BSD will provide us functions, not threads. So we have a common thread that
executes the function that we were given. One exception for this is the xenstore.
The xenstore is already threaded, and its interrupt handler will also just wake
up a thread. So for that we could do better in the future.

fbec6608

xen: provide C-accessible HYPERVISOR_shared_info · 6fc3d663

Glauber Costa authored 11 years ago

Xen files in BSD (and Linux for that matter) expect a variable called
HYPERVISOR_shared_info that points to the hypercall page - that in our case is
statically defined. So we just need to point it with the correct name to our
shared info page.

Note the type mismatch: we are defining our own xen_shared_info to be able to
access some parts of the structure, like the wallclock, more conveniently.
Because of that, we need a type cast.

6fc3d663

generalize interrupt handler further · 4fbb1834

Glauber Costa authored 11 years ago

Xen does not need to EOI. At least not with the APIC anyway: it signals
end of interrupt by flipping vcpu_info->evtchn_upcall_pending to 0, but
that is already done by the BSD handler, so we might as well do nothing.

Avi generalized the irq handler to have a pre_eoi and a handler, and in
this patch I am taking the extra step of adding an EOI indirection as well.
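
The resulting shape can be sketched as three optional hooks called in a fixed order. The `irq_hooks` type, `dispatch()` and `traced_hooks()` below are hypothetical; only the pre_eoi / EOI / handler split comes from the message.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical bundle of per-interrupt hooks.
struct irq_hooks {
    std::function<void()> pre_eoi;  // e.g. ack a level-triggered device
    std::function<void()> eoi;      // end-of-interrupt, now overridable
    std::function<void()> handler;  // the actual work, run after the EOI
};

// Call the hooks in order; any of them may be left empty.
inline void dispatch(const irq_hooks& h)
{
    if (h.pre_eoi) h.pre_eoi();
    if (h.eoi) h.eoi();
    if (h.handler) h.handler();
}

// Build hooks that record the call order. With apic_eoi == false the EOI
// hook does nothing, which models the Xen case described above.
inline irq_hooks traced_hooks(std::vector<std::string>& trace, bool apic_eoi)
{
    irq_hooks h;
    h.pre_eoi = [&trace] { trace.push_back("pre_eoi"); };
    h.eoi = [&trace, apic_eoi] { if (apic_eoi) trace.push_back("apic_eoi"); };
    h.handler = [&trace] { trace.push_back("handler"); };
    return h;
}
```

Making the EOI a hook rather than a hard-coded APIC write lets a Xen interrupt plug in a no-op without touching the dispatch path.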

4fbb1834

Aug 12, 2013

trivial: name the event channel fields in the vcpu · 0873adf5

Glauber Costa authored 11 years ago

There are two spaces for event channel fields in the xen vcpu data.  They were
so far just a pad because we were not using event channels.  Name them, so we
can use them. I am also taking the opportunity to fix the tabs/spaces in the
structure.

0873adf5

Aug 04, 2013

build: improve .text section generation · 3d58de8e

Avi Kivity authored 11 years ago

gcc generates some functions in their own section.  Have a wildcard that
catches all of these sections so they can all be merged into the global
.text section; this makes 'perf kvm top' format its output better.

The catch-all wildcard is placed last since ld uses the first match.

3d58de8e

string: fast memcpy and memset for machines without ERMS · 7f3df3ef

Avi Kivity authored 11 years ago

Not all machines have Enhanced REP MOVSB/STOSB (ERMS); provide optimized
fallbacks.

7f3df3ef

Jul 31, 2013

make initialization priorities explicit · f801c763

Glauber Costa authored 11 years ago

I recently ran into an early init bug that ended up being tracked
down to a changeset in which the initialization priorities of the constructors
were changed. One of the changed ones was kvmclock, but the change did not
update kvmclock.

I propose we use constants for that. To avoid things like this in the future,
wherever priorities are used, I believe they should come from the same place
so that the order is utterly obvious. To handle that, I am creating the prio.hh
file, and sticking all priority definitions in there.
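
A minimal sketch of the idea, assuming GCC's `init_priority` attribute on a platform that supports it (for example Linux with g++). The `init_prio` enumerators, their values and the `recorder` type are hypothetical and are not the contents of the real prio.hh.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical central list of priorities: lower values are constructed
// first. Values up to 100 are reserved for the implementation.
enum class init_prio : int {
    clock   = 200,
    console = 300,
};

// Function-local static, so the log exists before any recorder runs.
inline std::vector<std::string>& init_log()
{
    static std::vector<std::string> log;
    return log;
}

// Records its name when constructed, to make the order observable.
struct recorder {
    explicit recorder(const char* name) { init_log().push_back(name); }
};

// Deliberately defined in the "wrong" textual order: the priorities, not
// the order in the file, decide which constructor runs first.
recorder console_init __attribute__((init_priority((int)init_prio::console))) {"console"};
recorder clock_init __attribute__((init_priority((int)init_prio::clock))) {"clock"};
```

Because every user names an enumerator instead of a bare number, reordering subsystems means editing one list, and a stale number in a distant file can no longer go unnoticed.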

f801c763

Jul 29, 2013

build: avoid using __attribute__((cold)) if the compiler doesn't support it · 6d151431

Avi Kivity authored 11 years ago

__attribute__((cold)) on labels isn't supported by gcc 4.7.  Detect support
for the feature and enable it conditionally.

6d151431

apic: kvm pv eoi · d450e60d

Avi Kivity authored 11 years ago

kvm provides a way to EOI without exiting; detect its availability and use it.

d450e60d

interrupt: use the apic's eoi() method instead of directly writing to the x2apic · 66bd77a0

Avi Kivity authored 11 years ago

Works with xapic, and allows us to hook eoi().

66bd77a0

apic: add read() method · 5f8652e9

Avi Kivity authored 11 years ago

5f8652e9

interrupt: add separate pre- and post-eoi interrupt handler · f496f7a7

Avi Kivity authored 11 years ago

kvm's pv_eoi functionality fails if any vmexits are taken before the EOI
(likely due to a bug).  Work around this by deferring the handler until after
the interrupt has been EOIed.  Since level-triggered interrupts must be acked
at the device level prior to the EOI, add an additional handler to be called
prior to the EOI.

f496f7a7

percpu: align percpu data · 38b6390a

Avi Kivity authored 11 years ago

Make sure percpu data is aligned correctly as some hardware features expect it.

38b6390a

Jul 28, 2013

ioapic: switch to new with_lock() · 35aa5c63

Avi Kivity authored 11 years ago

35aa5c63