- Oct 01, 2013
-
-
Pekka Enberg authored
In preparation for enabling demand paging, enable sleeping in fault context by using a per-thread exception stack for normal faults and per-CPU exception stack for nested faults.

Avi Kivity explains:

Before [demand paging] can even hope to work, we need to enable sleeping in fault context. Right now each cpu has its own exception stack, which leads immediately to stack corruption:

  thread 1 faults
  enters exception stack
  tries to take mutex
  scheduler switches to thread 2
  thread 2 faults
  enters same exception stack

So we need to switch stacks. This can be done in the same way as for interrupt stacks (see thread::switch_to()).

Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
Signed-off-by: Avi Kivity <avi@cloudius-systems.com>
-
- Sep 30, 2013
-
-
Venkatesh Srinivas authored
Older versions of KVM and user VMMs expose kvmclock MSRs at different MSR offsets. Detect the old flag in kvmclock::probe() and use the old MSRs if they are the only ones available.

Signed-off-by: Venkatesh Srinivas <venkateshs@google.com>
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
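The selection described above can be sketched as a pure function. This is only an illustration, not the code from the commit: `kvmclock_msrs` and `select_kvmclock_msrs()` are hypothetical names, and `features` is assumed to be the EAX value of the KVM CPUID features leaf. The MSR numbers and feature bits are the ones documented for KVM paravirtual clocks.

```cpp
#include <cstdint>

// MSR numbers and CPUID feature bits documented for KVM paravirtual clocks.
constexpr std::uint32_t MSR_KVM_WALL_CLOCK      = 0x11;        // old offset
constexpr std::uint32_t MSR_KVM_SYSTEM_TIME     = 0x12;        // old offset
constexpr std::uint32_t MSR_KVM_WALL_CLOCK_NEW  = 0x4b564d00;  // new offset
constexpr std::uint32_t MSR_KVM_SYSTEM_TIME_NEW = 0x4b564d01;  // new offset
constexpr unsigned KVM_FEATURE_CLOCKSOURCE  = 0;  // old MSRs are available
constexpr unsigned KVM_FEATURE_CLOCKSOURCE2 = 3;  // new MSRs are available

// Hypothetical result type: the MSR pair the clock driver should use.
struct kvmclock_msrs {
    std::uint32_t wall_clock;
    std::uint32_t system_time;
};

// Hypothetical helper: prefer the new MSRs, fall back to the old ones if
// they are the only ones advertised, and report failure if neither is.
bool select_kvmclock_msrs(std::uint32_t features, kvmclock_msrs& out)
{
    if (features & (1u << KVM_FEATURE_CLOCKSOURCE2)) {
        out = kvmclock_msrs{MSR_KVM_WALL_CLOCK_NEW, MSR_KVM_SYSTEM_TIME_NEW};
        return true;
    }
    if (features & (1u << KVM_FEATURE_CLOCKSOURCE)) {
        out = kvmclock_msrs{MSR_KVM_WALL_CLOCK, MSR_KVM_SYSTEM_TIME};
        return true;
    }
    return false;  // no kvmclock on this host
}
```

Keeping the decision separate from the actual MSR writes makes the fallback easy to check without running under a hypervisor.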
-
- Sep 21, 2013
-
-
Glauber Costa authored
Now that we have an efficient interrupt handler, use it. There is no need to delete the old BSD code; it stays in place to avoid disrupting the file too much. Make sure through an assertion that it is never used, though.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
-
Glauber Costa authored
This version of the Xen interrupt handler tries to do as little work as possible in the interrupt itself. The previous version and my previous fix attempt would still clean the channels during interrupt. Because now we have pending_sel still set in the irq thread, we can ditch _irq_pending completely.

There is now only one xen_irq for the entire system, and therefore I am registering one per cpu, since we will eventually have to process this in different cpus (for different event channels).

With this, in my (very coarse, host to guest) netperf test, I am achieving 9600 * 10^6 bps, while linux can reach ~10000 * 10^6 bps. So we're getting close:

Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

 65536  16384   16384    10.00    9589.32

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
-
Glauber Costa authored
Some of the fields in the xen shared structure need to be accessed atomically. Move them to std::atomic so we can do that using C++11 primitives.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
-
- Sep 18, 2013
-
-
Sasha Levin authored
percpu had too little space allocated to support 64 vcpus, which led to a crash when booting with more than 13 vcpus. Fix it by using a correct size to support 64 vcpus.

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Avi Kivity <avi@cloudius-systems.com>
-
- Sep 15, 2013
-
-
Nadav Har'El authored
Add Cloudius copyright to everything in arch/x64. This includes C++ code, assembly code, and ld scripts.
-
- Sep 12, 2013
-
-
Dmitry Fleytman authored
This patch implements GSI interrupt support for Xen bus. Needed in Xen environments w/o vector callbacks for HVM. One example of such an environment is Amazon EC2.
-
Dmitry Fleytman authored
-
- Sep 11, 2013
-
-
Dmitry Fleytman authored
XAPIC is supported as a fall-back when X2APIC is not available
-
- Sep 05, 2013
-
-
Glauber Costa authored
Because we will be copying the bootloader code to the beginning of the disk, make sure we won't step over the partition table space. This is technically not needed if the code is small enough, but this guard code will 1) make sure that doesn't happen, and 2) make sure the space is zeroed out. The signature though, is needed, and is set to the bytes "O", "S" and "V", which will span VSO in the end.
-
Glauber Costa authored
It currently sits in the middle of the partition table. Move it to a safer location.
-
Glauber Costa authored
Right now we are doing it right before we parse the MADT, but this is by far not MADT specific. Other users are planned, and the best way to resolve the disputes is to have it in a separate constructor
-
- Aug 28, 2013
-
-
Glauber Costa authored
The x2APIC specification says that reading from the X2APIC_ID MSR should return the physical apic id of the current processor. However, the Xen implementation (as of 4.2.2) is broken, and reads actually return the old-style xAPIC id. Even if they fix it, we still have HVs deployed around that will return the wrong ID. We can work around this by testing if the returned APIC id is in the form (id << 24), since in that case the low 24 bits will all be zero. Then at least we can get this working everywhere. This may pose a problem if we ever want to support more than 1 << 24 vCPUs (or if any other HV has some random x2apic ids), but that is highly unlikely anyway.
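A minimal sketch of the (id << 24) test, assuming the raw MSR value has already been read into a 32-bit integer. `normalize_apic_id()` is a hypothetical name used for illustration, not the function from the patch.

```cpp
#include <cstdint>

// Hypothetical helper: turn a value read from the X2APIC_ID MSR into a
// plain APIC id. If the low 24 bits are all zero, assume the hypervisor
// returned an xAPIC-style value (id in bits 31..24) and shift it down.
std::uint32_t normalize_apic_id(std::uint32_t raw)
{
    constexpr std::uint32_t low_mask = (1u << 24) - 1;
    if ((raw & low_mask) == 0) {
        return raw >> 24;  // xAPIC-style layout
    }
    return raw;            // already a proper x2APIC id
}
```

A raw value of 0 passes through both interpretations unchanged, so CPU 0 needs no special case. The ambiguity mentioned in the commit message shows up here as well: a genuine x2APIC id that is a nonzero multiple of 1 << 24 would be misread.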
-
Glauber Costa authored
As I have described in a previous patch, the Xen hypervisor has a very nasty bug that causes all of the x2apic msr writes to trigger a GPF. Although the request proceeds fine despite the GPF, it does bring a problem for all-but-self style init sequences we are using: after "failing" (succeeding but returning failure) to deliver the interrupt for the first cpu in the group, xen will break the loop, therefore not delivering the SIPIs to other cpus in the system at all. We can work around that by delivering interrupts to each cpu individually, instead of all-but-self.
-
Glauber Costa authored
Unfortunately, the Xen hypervisor has a very nasty bug (seems to be fixed by a 2013 patch - which means that although it is fixed, a lot of hypervisors will have it) that causes all of the x2apic msr writes to init-related registers (INIT, SIPI, etc.) to trigger a GPF. The way to work around this is to implement a form of "wrmsr_safe".
-
- Aug 27, 2013
-
-
Glauber Costa authored
We can't trust the state of the FPU and the CSR registers to be always sane. Apparently, they aren't on at least one version of Xen (which happens to be the one I am using). Initialize them manually for all CPUs on bringup.
-
Glauber Costa authored
In the xen interrupt code, I have made the mistake of exchanging the previous value of _irq_pending with true, which means that we were constantly polling for data in the interrupt threads. This was responsible for the latency spikes I was seeing. The simple "ping" test still shows bad results in absolute terms, but at least now the spikes are gone.
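The mistake is easy to reproduce with a plain `std::atomic<bool>`. The sketch below uses a hypothetical stand-in flag and helper names, and assumes the intended behaviour is to consume the flag by swapping in `false`.

```cpp
#include <atomic>

// Hypothetical stand-in for the _irq_pending flag described above.
std::atomic<bool> irq_pending{false};

// Buggy variant: swaps in `true`, so the flag stays set after the first
// check and every later check reports pending work.
bool consume_pending_buggy()
{
    return irq_pending.exchange(true);
}

// Assumed fix: swap in `false`, so one raised interrupt is seen once.
bool consume_pending()
{
    return irq_pending.exchange(false);
}
```

With the buggy variant, a waiting thread that loops on the flag never finds it clear again, which matches the constant polling described in the commit message.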
-
- Aug 26, 2013
-
-
Avi Kivity authored
A signal within a signal handler is really bad news; abort when it happens to let the developers debug it.
-
Avi Kivity authored
Trying to execute the null pointer, or faulting within kernel code, is a really bad sign, and it's better to abort early when either happens.
-
- Aug 21, 2013
-
-
Avi Kivity authored
The dependency on sse4.1 crashes on older cpus; use the generic musl implementation.
-
- Aug 13, 2013
-
-
Glauber Costa authored
To make matters even clearer, enclose the main alternative macro in a xen-specific macro, so we don't have to code xen's presence condition everywhere.
-
Avi Kivity authored
Add a xen_init() function (currently only stores the start_info pointer) and jump to the normal init sequence. [ glommer: rebased to current tip ]
-
Avi Kivity authored
This metadata identifies the kernel to Xen and enables the pv loader. [ glommer: adjusted to current tip and fixed header include ]
-
Glauber Costa authored
By issuing this hypercall, we can control where xen delivers the interrupt to. Right now we will only support vectored callbacks. It should not be hard to extend this for gsi and intx for HVM guests.
-
Glauber Costa authored
If we have this array, BSD code that checks for features can run unmodified.
-
Glauber Costa authored
The BSD pv event channel will expect the underlying OS to be able to register a PIC. For now we will just allow that for xen, and provide the expected translation functions to allow xen to work. The design I have chosen is to let the xen event handler run in interrupt context. We can do threaded if it really becomes a problem, but right now it should do. The handlers themselves, though, will be threaded. So the intr_execute_handlers() function will do nothing more than to wake the respective threads. BSD will provide us functions, not threads. So we have a common thread that executes the function that we were given. One exception for this is the xenstore. The xenstore is already threaded, and its interrupt handler will also just wake up a thread. So for that we could do better in the future.
-
Glauber Costa authored
Xen files in BSD (and Linux for that matter) expect a variable called HYPERVISOR_shared_info that points to the hypercall page - that in our case is statically defined. So we just need to point it with the correct name to our shared info page. Note the type mismatch: we are defining our own xen_shared_info to be able to access some parts of the structure, like the wallclock, more conveniently. Because of that, we need a type cast.
-
Glauber Costa authored
Xen does not need to EOI. At least not with the APIC anyway: it signals end of interrupt by flipping vcpu_info->evtchn_upcall_pending to 0, but that is already done by the BSD handler, so we might as well do nothing. Avi generalized the irq handler to have a pre_eoi and a handler, and in this patch I am taking the extra step of adding an EOI indirection as well.
-
- Aug 12, 2013
-
-
Glauber Costa authored
There are two spaces for event channel fields in the xen vcpu data. They were so far just a pad because we were not using event channels. Name them, so we can use them. I am also taking the opportunity to fix the tabs/spaces in the structure.
-
- Aug 04, 2013
-
-
Avi Kivity authored
gcc generates some functions in their own section. Have a wildcard that catches all of these sections so they can all be merged into the global .text section; this makes 'perf kvm top' format its output better. The catch-all wildcard is placed last since ld uses the first match.
-
Avi Kivity authored
Not all machines have Enhanced REP MOVSB/STOSB (ERMS); provide optimized fallbacks.
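One way to picture the dispatch is a function pointer chosen once from the CPUID feature bit. ERMS is reported in CPUID.(EAX=7,ECX=0):EBX bit 9; everything else below (`has_erms`, `select_copy`, and the two portable copy routines standing in for the `rep movsb` path and the fallback) is a hypothetical sketch rather than the code from the commit.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

constexpr unsigned ERMS_BIT = 9;  // CPUID.(EAX=7,ECX=0):EBX bit 9

// `cpuid7_ebx` is assumed to be the EBX value returned by CPUID leaf 7.
inline bool has_erms(std::uint32_t cpuid7_ebx)
{
    return ((cpuid7_ebx >> ERMS_BIT) & 1u) != 0;
}

using copy_fn = void* (*)(void*, const void*, std::size_t);

// Portable stand-in for the `rep movsb` path: a plain byte copy.
void* copy_bytes(void* dst, const void* src, std::size_t n)
{
    auto* d = static_cast<unsigned char*>(dst);
    auto* s = static_cast<const unsigned char*>(src);
    for (std::size_t i = 0; i < n; ++i) {
        d[i] = s[i];
    }
    return dst;
}

// Stand-in for a fallback: copy 8 bytes at a time, then the tail.
void* copy_words(void* dst, const void* src, std::size_t n)
{
    auto* d = static_cast<unsigned char*>(dst);
    auto* s = static_cast<const unsigned char*>(src);
    while (n >= 8) {
        std::uint64_t w;
        std::memcpy(&w, s, 8);
        std::memcpy(d, &w, 8);
        d += 8; s += 8; n -= 8;
    }
    while (n > 0) {
        *d++ = *s++;
        --n;
    }
    return dst;
}

// Pick the implementation once, based on the feature bit.
copy_fn select_copy(std::uint32_t cpuid7_ebx)
{
    return has_erms(cpuid7_ebx) ? copy_bytes : copy_words;
}
```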
-
- Jul 31, 2013
-
-
Glauber Costa authored
I recently ran into an early init bug that ended up being tracked down to a changeset in which the initialization priorities of the constructors were changed. One of the changed ones was kvmclock, but the change did not update kvmclock. I propose we use constants for that. To avoid things like this in the future, wherever priorities are used, I believe they should come from the same place so that the order is utterly obvious. To handle that, I am creating the prio.hh file, and sticking all priority definitions in there.
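The idea can be sketched with GCC's `init_priority` attribute, where lower numbers are constructed first. The names and values below are purely illustrative and are not the contents of prio.hh; the `tracer` type exists only to record construction order.

```cpp
#include <string>
#include <vector>

// Hypothetical central table: every constructor priority lives here, so
// the relative order is visible in one place.
namespace init_prio {
constexpr int clock_base = 101;  // illustrative value
constexpr int kvmclock   = 102;  // must run after clock_base
}

// Log of construction order. A function-local static is used so the log
// exists before any of the prioritized objects below are constructed.
std::vector<std::string>& init_log()
{
    static std::vector<std::string> log;
    return log;
}

struct tracer {
    explicit tracer(const char* name) { init_log().push_back(name); }
};

// Defined in the "wrong" textual order on purpose: the priorities, not
// the position in the file, decide which constructor runs first.
tracer kvmclock_init __attribute__((init_priority(init_prio::kvmclock)))
    = tracer("kvmclock");
tracer clock_base_init __attribute__((init_priority(init_prio::clock_base)))
    = tracer("clock_base");
```

Changing an order then means editing one table, and any user of a priority picks up the change automatically. Note that `init_priority` is a GCC extension (also accepted by Clang on ELF targets), so this sketch is not portable C++.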
-
- Jul 29, 2013
-
-
Avi Kivity authored
__attribute__((cold)) on labels isn't supported by gcc 4.7. Detect support for the feature and enable it conditionally.
-
Avi Kivity authored
kvm provides a way to EOI without exiting; detect its availability and use it.
-
Avi Kivity authored
Works with xapic, and allows us to hook eoi().
-
Avi Kivity authored
-
Avi Kivity authored
kvm's pv_eoi functionality fails if any vmexits are taken before the EOI (likely due to a bug). Work around this by deferring the handler until after the interrupt has been EOIed. Since level-triggered interrupts must be acked at the device level prior to the EOI, add an additional handler to be called prior to the EOI.
-
Avi Kivity authored
Make sure percpu data is aligned correctly as some hardware features expect it.
-
- Jul 28, 2013
-
-
Avi Kivity authored
-