  1. Dec 01, 2013
  2. Nov 26, 2013
    • Reduce number of unnecessary sections in our executable · 03aaf6b8
      Nadav Har'El authored
      
      This patch resolves issue #26. As you can see with "objdump -h
      build/release/loader.elf", our executable had over a thousand (!)
      separate sections, most of which should really be merged.
      We already started doing this in arch/x64/loader.ld, but didn't
      complete the work.
      
      This patch merges all the ".gcc_except_table.*" sections into one,
      and all the ".data.rel.ro.*" sections into one. After this merge,
      we are left with just 52 sections, instead of more than 1000.
      
      The default linker script (run "ld --verbose" to see it) also does
      similar merges, so there's no reason why we shouldn't.
      
      By reducing the number of ELF sections (each comes with a name, headers,
      etc.), this patch also reduces the size of our loader-stripped.elf
      by about 140K.
      
      Fixes #26.
      
      Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
      Signed-off-by: Avi Kivity <avi@cloudius-systems.com>
      03aaf6b8
    • xen: move per-cpu interrupt threads to .percpu section · 63d2e472
      Dmitry Fleytman authored
      
      The bug fixed by this patch made OSv crash on Xen during boot.
      The problem started to show up after commit:
      
        commit ed808267
        Author: Nadav Har'El <nyh@cloudius-systems.com>
        Date:   Mon Nov 18 23:01:09 2013 +0200
      
            percpu: Reduce size of .percpu section
      
      Signed-off-by: Dmitry Fleytman <dmitry@daynix.com>
      Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
      63d2e472
  3. Nov 21, 2013
  4. Nov 19, 2013
    • percpu: Reduce size of .percpu section · ed808267
      Nadav Har'El authored
      
      This patch reduces the size of the .percpu section 64-fold from about
      5 MB to 70 KB, and solves issue #95.
      
      The ".percpu" section is part of the .data section of our executable
      (loader-stripped.elf). In our 15 MB executable, roughly 7 MB is text
      (code), and 7 MB is data, and out of that, a whopping 5 MB is the
      ".percpu" section. The executable is read in real mode, and this is
      especially slow on Amazon EC2, hence our wish to make the executable
      as small as possible.
      
      The percpu section starts with all the PERCPU variables defined in the
      program. We have about 70 KB of those, and believe it or not, most of
      this 70 KB is just a single variable, the 65K dynamic_percpu_buffer
      (see percpu.cc).
      
      But then, we need a copy of these variables for each CPU. The unpatched
      code duplicated this 70 KB section 64 times in the executable file (!),
      and then used these memory locations for up to 64 CPUs. But there is
      no reason to duplicate this data in the executable! All we need to do
      is to dynamically allocate a copy of this section for each CPU, and
      this is what this patch does.
      
      This patch removes about 5 MB from our executable: After this patch,
      our loader-stripped.elf is just 9.7 MB, and its data section's size is
      just 2.8 MB.
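
      The idea of allocating the per-CPU copies at run time can be sketched
      as follows. This is a minimal illustration, not the code in percpu.cc:
      percpu_template and allocate_percpu_areas are hypothetical names, and
      the template is assumed to be a plain byte range that can be copied
      with memcpy.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for the single template copy of the per-CPU
// variables that is stored once in the executable.
struct percpu_template {
    const unsigned char* start;
    std::size_t size;
};

// Allocate one private, heap-backed copy of the template for each CPU,
// instead of storing one copy per CPU in the executable file.
std::vector<std::unique_ptr<unsigned char[]>>
allocate_percpu_areas(const percpu_template& tmpl, unsigned ncpus)
{
    assert(tmpl.start != nullptr && tmpl.size > 0);
    std::vector<std::unique_ptr<unsigned char[]>> areas;
    areas.reserve(ncpus);
    for (unsigned cpu = 0; cpu < ncpus; ++cpu) {
        std::unique_ptr<unsigned char[]> area(new unsigned char[tmpl.size]);
        // Each CPU starts from the initial values recorded in the template.
        std::memcpy(area.get(), tmpl.start, tmpl.size);
        areas.push_back(std::move(area));
    }
    return areas;
}
```

      With this shape the file holds the template once, and the memory cost
      of the copies is paid only for the CPUs actually present.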
      
      Reviewed-by: Glauber Costa <glommer@cloudius-systems.com>
      Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
      Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
      ed808267
  5. Nov 12, 2013
  6. Nov 11, 2013
  7. Nov 07, 2013
  8. Nov 04, 2013
  9. Nov 01, 2013
  10. Oct 30, 2013
    • x64: fix TLS segment alignment · 15d30a11
      Avi Kivity authored
      
      The TLS segment is a little weird in that it grows backwards from percpu_base
      instead of forwards.  This causes the alignment code to calculate wrong
      offsets when the segment size is 8 (mod 16).  A failure was seen where
      ::percpu_base was set at offset 0xfffffffffffffa08 in code that was in the
      same translation unit as ::percpu_base, and 0xfffffffffffffa10 elsewhere.
      This caused all dynamic_percpu instances to crash.
      
      Fix by aligning the segment size.  For good measure, align also the segment
      base, both to a cacheline boundary.
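
      The arithmetic can be illustrated with a small sketch. The two offsets
      quoted above lie 0x5f8 and 0x5f0 bytes below the base, and 0x5f8 is 8
      (mod 16). The helper names are hypothetical, and the 64-byte cache
      line is an assumption; the actual fix may be expressed differently,
      for example in the linker script.

```cpp
#include <cassert>
#include <cstdint>

// Round `value` up to the next multiple of `alignment`, which must be a
// power of two.
inline std::uint64_t align_up(std::uint64_t value, std::uint64_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~(alignment - 1);
}

// Offset of the start of a segment that grows downward from its base:
// the negated segment size, wrapped to 64 bits.
inline std::uint64_t segment_start_offset(std::uint64_t segment_size)
{
    return std::uint64_t(0) - segment_size;
}
```

      For a size of 0x5f8, segment_start_offset gives 0xfffffffffffffa08,
      the first offset quoted above. After align_up(0x5f8, 64) the size is
      0x600, so the start offset no longer depends on whether it is
      computed before or after rounding.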
      
      Signed-off-by: Avi Kivity <avi@cloudius-systems.com>
      15d30a11
  11. Oct 28, 2013
    • x64: fix APIC ID register read on QEMU · c19e28f4
      Avi Kivity authored
      
      The expression *p >> 24, where p is an unsigned*, is optimized by
      the compiler to *((u8*)p + 3) - reading the most significant byte only and
      dropping the shift.
      
      When this optimization is applied to reading the APIC ID, QEMU
      returns zero for all processors, since the manual requires reading
      entire words.
      
      Fix by using a volatile pointer, disabling the optimization.
      
      Note that QEMU is technically correct here though it violates all known
      real x86 implementations.
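
      A minimal sketch of the volatile read, assuming the ID register is
      reachable through a 32-bit pointer and that the ID sits in bits 24-31.
      read_apic_id is a hypothetical name, not the function changed by this
      patch.

```cpp
#include <cassert>
#include <cstdint>

// Read the APIC ID from a 32-bit ID register. Because the pointer is
// volatile, the compiler has to treat the load as an observable access
// and, in practice, performs it at full width instead of narrowing it to
// a one-byte load of the most significant byte.
inline std::uint32_t read_apic_id(const volatile std::uint32_t* id_register)
{
    assert(id_register != nullptr);
    return *id_register >> 24;
}
```

      The shift still happens, but only after the whole word has been read.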
      
      Signed-off-by: Avi Kivity <avi@cloudius-systems.com>
      c19e28f4
    • x64: fix xapic INIT IPI · e7ba2287
      Avi Kivity authored
      
      xapic::init_ipi() shifts apic_id by 24, unaware that xapic::ipi will do it
      again.  The result is that the boot processor is reset instead of the
      auxiliary processor.
      
      Remove the extraneous shift.
      
      Found by booting with QEMU without kvm.
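
      Why the double shift ends up addressing the boot processor can be
      shown with a toy model. It assumes 32-bit arithmetic, a destination
      field in bits 24-31, and that the boot processor has APIC ID 0; the
      function names are hypothetical and do not mirror the real xapic
      class.

```cpp
#include <cassert>
#include <cstdint>

// Model of the low-level send routine: it places the APIC ID in bits
// 24-31 of a 32-bit destination word.
inline std::uint32_t destination_word(std::uint32_t apic_id)
{
    return apic_id << 24;
}

// Buggy caller: pre-shifts the ID, so the ID is shifted by 48 bits in
// total and falls off the top of the 32-bit word.
inline std::uint32_t buggy_init_destination(std::uint32_t apic_id)
{
    return destination_word(apic_id << 24);
}

// Fixed caller: passes the raw ID and leaves the only shift to the send
// routine.
inline std::uint32_t fixed_init_destination(std::uint32_t apic_id)
{
    assert(apic_id <= 0xff);
    return destination_word(apic_id);
}
```

      In this model the buggy path yields a destination of 0 for any small
      ID, which under the stated assumption is the boot processor.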
      
      Signed-off-by: Avi Kivity <avi@cloudius-systems.com>
      e7ba2287
  12. Oct 24, 2013
  13. Oct 23, 2013
    • Fix top of call stack - and treatment of unhandled C++ exceptions · 7fc023e8
      Nadav Har'El authored
      
      As noticed by Tomek in issue #64, unhandled C++ exceptions cause OSv to
      silently hang, in an endless loop inside the unwinding code.
      
      So this patch fixes the wrong CFI (DWARF Call Frame Information) which
      caused the unwinder to loop. We just had a single line of assembly missing:
      The topmost frame - the thread's main function - needs to undefine the
      saved %rip to prevent going further back. If we don't do that, gdb will
      end every "bt" output with a warning "Frame did not save its PC" (but hey,
      nobody complained... ;-)), and the unwinding library will, unfortunately,
      go into an endless loop as seen in issue #64.
      
      With this one-line patch, unhandled exceptions now work as expected -
      they abort with a message like:
      
      	terminate called after throwing an instance of 'int'
      	Aborted
      
      And attaching a debugger you can see exactly where the offending throw came
      from (i.e., the stack does *not* unnecessarily unwind when there's nobody
      waiting to catch the exception).
      
      This works for uncaught exceptions anywhere - including inside main()
      and from constructors when loading the object (before running main()).
      
      "bt" in gdb also no longer ends each stack trace with an error message.
      The last frame it shows is "thread_main()".
      
      Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
      7fc023e8
  14. Oct 22, 2013
  15. Oct 16, 2013
    • x64: Register dump on GP fault · ca52fa23
      Pekka Enberg authored
      
      Dump registers on general protection fault for debugging purposes.  Even
      if you have gdb available, getting to the exception frame is not always
      possible after OSv has crashed.
      
      Example output looks as follows:
      
      registers:
      RIP: 0x0000100000b7e913  RFL: 0x0000000000010202  CS:  0x0000000000000008  SS:  0x0000000000000010
      RAX: 0xffffc000418ed278  RBX: 0xffffc00041b2c050  RCX: 0x0000000000000004  RDX: 0x0000000000000000
      RSI: 0x0000000000000001  RDI: 0x43e0000000000000  RBP: 0x0000200008548d10  R8:  0xffffc000426e3010
      R9:  0x0000000000000004  R10: 0x43e0000000000000  R11: 0xffffc00041b2c050  R12: 0xffffc000418ed1e8
      R13: 0x0000000000000004  R14: 0x43e0000000000000  R15: 0xffffc00041b2c050  RSP: 0x0000200008548aa0
      general protection fault
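
      The line format above can be reproduced with a small helper. This is
      only a sketch of the formatting; format_register is a hypothetical
      name, and the real dump code may build its output differently.

```cpp
#include <cassert>
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Format one register the way the example output shows it. The label
// (name plus colon) is left-justified in four characters, so two-letter
// names such as R8 line up with three-letter ones.
inline std::string format_register(const std::string& name, std::uint64_t value)
{
    assert(!name.empty() && name.size() <= 3);
    char buf[32];
    const std::string label = name + ":";
    std::snprintf(buf, sizeof(buf), "%-4s 0x%016" PRIx64, label.c_str(), value);
    return buf;
}
```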
      
      Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
      ca52fa23
  16. Oct 11, 2013
    • x64: Fix nested exception debugging · 82301253
      Pekka Enberg authored
      
      As of commit a449b889 ("x64: Enable sleeping in fault context") it's now
      safe for another thread to enter a fault handler on the same CPU.  Fix
      exception guard to reflect that.
      
      This is needed for demand paging where a page fault from another thread
      can happen on the same CPU where a thread is sleeping in the page fault
      handler.
      
      Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
      82301253
  17. Oct 10, 2013
    • build: define _KERNEL everywhere · 95ce17e3
      Avi Kivity authored
      We have _KERNEL defines scattered throughout the code, which makes
      understanding it difficult.
      
      Define it just once, and adjust the source to build.
      
      We define it in an overridable variable, so that non-kernel imported code
      can undo it.
      95ce17e3
  18. Oct 01, 2013
    • x64: Enable sleeping in fault context · a449b889
      Pekka Enberg authored
      
      In preparation for enabling demand paging, enable sleeping in fault
      context by using a per-thread exception stack for normal faults and
      per-CPU exception stack for nested faults.
      
      Avi Kivity explains:
      
        Before [demand paging] can even hope to work, we need to enable
        sleeping in fault context.  Right now each cpu has its own exception
        stack, which leads immediately to stack corruption:
      
        thread 1 faults
        enters exception stack
        tries to take mutex
        scheduler switches to thread 2
        thread 2 faults
        enters same exception stack
      
        So we need to switch stacks.  This can be done in the same way as for
        interrupt stacks (see thread::switch_to()).
      
      Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
      Signed-off-by: Avi Kivity <avi@cloudius-systems.com>
      a449b889
  19. Sep 30, 2013
  20. Sep 21, 2013
    • xen: use c++ interrupt handler · ea4cb9f6
      Glauber Costa authored
      
      Now that we have an efficient interrupt handler, use it. The old BSD code is
      not deleted, to avoid disrupting the file too much, but an assertion makes
      sure that it is never used.
      
      Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
      ea4cb9f6
    • xen: rework interrupt handler · a837afe5
      Glauber Costa authored
      
      This version of the Xen interrupt handler tries to do as little work as possible
      in the interrupt itself. The previous version and my previous fix attempt would
      still clean the channels during interrupt.
      
      Because now we have pending_sel still set in the irq thread, we can ditch
      _irq_pending completely.
      
      There is now only one xen_irq for the entire system, and therefore I am
      registering one per cpu, since we will eventually have to process this in
      different cpus. (for different event channels).
      
      With this, in my (very coarse, host to guest) netperf test, I am achieving
      9600 * 10^6 bps, while Linux can reach ~10000 * 10^6 bps. So we're getting close:
      
      Recv   Send    Send
      Socket Socket  Message  Elapsed
      Size   Size    Size     Time     Throughput
      bytes  bytes   bytes    secs.    10^6bits/sec
      
       65536  16384  16384    10.00    9589.32
      
      Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
      a837afe5
    • xen: declare shared types as atomic · de6ba640
      Glauber Costa authored
      
      Some of the fields in the xen shared structure need to be accessed atomically.
      Move them to std::atomic so we can do that using C++11 primitives.
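
      As an illustration of the C++11 primitives involved, here is a
      minimal sketch with a single 64-bit word of pending bits. The
      shared_area struct and the helper names are hypothetical and far
      simpler than the real Xen shared structure.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical, simplified stand-in for a field shared with the hypervisor.
struct shared_area {
    std::atomic<std::uint64_t> pending{0};
};

// Atomically set one bit; returns true if the bit was not already set.
inline bool mark_pending(shared_area& area, unsigned bit)
{
    assert(bit < 64);
    const std::uint64_t mask = std::uint64_t(1) << bit;
    return (area.pending.fetch_or(mask, std::memory_order_acq_rel) & mask) == 0;
}

// Atomically fetch all pending bits and clear the word in one step, so a
// bit set concurrently is either returned now or left for the next call.
inline std::uint64_t take_pending(shared_area& area)
{
    return area.pending.exchange(0, std::memory_order_acq_rel);
}
```

      Using exchange rather than a separate load and store is what keeps a
      concurrently set bit from being lost between the two steps.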
      
      Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
      de6ba640
  21. Sep 18, 2013
  22. Sep 15, 2013
  23. Sep 12, 2013
  24. Sep 11, 2013
  25. Sep 05, 2013
    • boot16.S: open up space for partition table · 4a6d51d5
      Glauber Costa authored
      Because we will be copying the bootloader code to the beginning of the disk, make
      sure we won't step over the partition table space. This is technically not needed
      if the code is small enough, but this guard code will 1) make sure that doesn't
      happen, and 2) make sure the space is zeroed out.
      
      The signature though, is needed, and is set to the bytes "O", "S" and "V", which
      will span VSO in the end.
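
      For reference, the space being protected can be described with the
      classic MBR layout constants. This sketch assumes a standard 512-byte
      boot sector; boot_code_fits is a hypothetical helper, and the
      placement of the "O", "S", "V" signature is not modeled here.

```cpp
#include <cassert>
#include <cstddef>

// Classic MBR layout of a 512-byte boot sector.
constexpr std::size_t sector_size = 512;
constexpr std::size_t partition_table_offset = 446;  // 0x1BE
constexpr std::size_t partition_entry_size = 16;
constexpr std::size_t partition_entry_count = 4;
constexpr std::size_t boot_signature_offset = 510;   // bytes 0x55, 0xAA

static_assert(partition_table_offset
                  + partition_entry_size * partition_entry_count
                  == boot_signature_offset,
              "the partition table ends where the boot signature begins");
static_assert(boot_signature_offset + 2 == sector_size,
              "the boot signature occupies the last two bytes");

// True if boot code of the given size leaves the partition table untouched.
inline bool boot_code_fits(std::size_t code_size)
{
    return code_size <= partition_table_offset;
}
```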
      4a6d51d5
    • bootloader: move count32 variable · fcf173eb
      Glauber Costa authored
      It currently sits in the middle of the partition table. Move it to a safer
      location.
      fcf173eb
    • acpi: move table initialization to its own constructor · bf15592d
      Glauber Costa authored
      Right now we are doing it right before we parse the MADT, but this is by no
      means MADT-specific. Other users are planned, and the best way to resolve the
      disputes is to have it in a separate constructor.
      bf15592d
  26. Aug 28, 2013
    • work around xen x2apic bug · cc3d517a
      Glauber Costa authored
      The x2APIC specification says that reading from the X2APIC_ID MSR should return
      the physical APIC ID of the current processor. However, the Xen implementation
      (as of 4.2.2) is broken, and reads actually return the old-style xAPIC ID. Even if
      they fix it, there are still deployed hypervisors that will return the wrong ID.
      We can work around this by testing whether the returned APIC ID is of the form (id
      << 24), since in that case the low 24 bits will all be zero. Then at least
      we can get this working everywhere. This may pose a problem if we ever want to
      support more than 1 << 24 vCPUs (or if any other hypervisor has some random x2APIC
      IDs), but that is highly unlikely anyway.
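
      The test described above can be sketched as a small normalization
      step. normalize_apic_id is a hypothetical name, and the raw value is
      assumed to be the 32-bit result of reading the ID register.

```cpp
#include <cassert>
#include <cstdint>

// If the low 24 bits of the raw value are all zero, assume the hypervisor
// returned the old xAPIC layout (id << 24) and shift the ID back down.
// Otherwise take the value as a genuine x2APIC ID.
inline std::uint32_t normalize_apic_id(std::uint32_t raw)
{
    if ((raw & 0x00ffffffu) == 0) {
        return raw >> 24;
    }
    return raw;
}
```

      A genuine x2APIC ID that happens to be a multiple of 1 << 24 would be
      misread by this check, which is the limitation the message accepts.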
      cc3d517a
    • apic: bringup cpus individually instead of all at the same time · 5cb16020
      Glauber Costa authored
      As I have described in a previous patch, the Xen hypervisor has a very nasty
      bug that causes all of the x2APIC MSR writes to trigger a GPF. Although the
      request proceeds fine despite the GPF, it does cause a problem for the all-but-self
      style init sequences we are using: after "failing" (succeeding but returning
      failure) to deliver the interrupt to the first CPU in the group, Xen will
      break the loop, therefore not delivering the SIPIs to the other CPUs in the system
      at all. We can work around that by delivering interrupts to each CPU
      individually, instead of all-but-self.
      5cb16020
    • implement wrmsr_safe · a7ea5784
      Glauber Costa authored
      Unfortunately, the Xen hypervisor has a very nasty bug (it seems to be fixed by a
      2013 patch, which means that although it is fixed, a lot of deployed hypervisors will
      still have it) that causes all of the x2APIC MSR writes to init-related registers
      (INIT, SIPI, etc.) to trigger a GPF. The way to work around this is to implement a
      form of "wrmsr_safe".
      a7ea5784