  1. Sep 12, 2013
    • Support for Xen w/o vector callbacks · 1d3e336c
      Dmitry Fleytman authored
      This patch implements GSI interrupt support for the Xen bus, needed in
      Xen environments without vector callback support for HVM guests. One
      example of such an environment is Amazon EC2.
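
      A minimal sketch of the fallback this enables (all names here are
      hypothetical, not OSv's actual API): when HVM vector callbacks are
      unavailable, Xen event-channel notifications are delivered through a
      legacy GSI interrupt instead.

          // Hypothetical names throughout; this shows the shape of the
          // fallback, not the actual OSv implementation.
          void xenbus_setup_notifications()
          {
              if (have_vector_callbacks()) {
                  // Fast path: the hypervisor injects a dedicated vector.
                  register_callback_vector(xen_event_handler);
              } else {
                  // Fallback (e.g. Amazon EC2): use the GSI wired to the
                  // Xen platform PCI device's interrupt line.
                  register_gsi_interrupt(xen_platform_gsi(), xen_event_handler);
              }
          }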
    • aeb82f51
    • Test: cpu load balancing test · b255c259
      Nadav Har'El authored
      This is a test for the effectiveness of our scheduler's load balancing while
      running several threads on several cpus.
      
      A full description of the test and its expected results is included in
      comments at the beginning of the code, but briefly: the test runs
      multiple concurrent busy-loop threads, plus an additional "intermittent"
      thread (one that busy-loops for a short duration and then sleeps), and
      expects that all busy threads get their fair share of the CPU and that
      the intermittent thread won't disturb them too much.
      
      Testing the current code, this test demonstrates the following problems
      we have:
      
      1. Two busy-loop threads on 2 cpus are 5%-10% slower than just one.
         This is not kernel overhead (profiling shows 100% of the time in the
         test's inner loop), and I see exactly the same slowdown when running
         this test on the Linux host, so it might be related to the host's
         multitasking? For now, let's not worry about that.
      
      2. Much more worrying is that the intermittent thread sometimes (in
         about half the tests) causes us to fully use only one CPU, and of
         course get bad performance.
      
      3. In many of the tests involving more than 2 threads (2 threads +
         intermittent, or 4 threads) load balancing wasn't fair and some
         threads got more CPU than the others.
      
      Later I'll send patches to fix issues 2 and 3, which appear to happen because
      the load balancer thread doesn't run as often as it should, because of vruntime
      problems.
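
      A standalone sketch of the test's structure (simplified; the real test,
      with its exact durations and pass/fail thresholds, lives in the OSv
      tree):

          #include <atomic>
          #include <chrono>
          #include <cstdio>
          #include <thread>
          #include <vector>

          int main()
          {
              using namespace std::chrono;
              const int nbusy = 2;                // busy-loop threads
              std::atomic<bool> stop(false);
              std::vector<long> counts(nbusy, 0);
              std::vector<std::thread> busy;
              for (int i = 0; i < nbusy; i++) {
                  busy.emplace_back([&counts, &stop, i] {
                      // With fair load balancing, each busy thread should
                      // end up with roughly the same iteration count.
                      while (!stop.load(std::memory_order_relaxed)) {
                          counts[i]++;
                      }
                  });
              }
              std::thread intermittent([&stop] {
                  // Busy-loop briefly, then sleep; this thread should not
                  // significantly disturb the busy threads.
                  while (!stop.load(std::memory_order_relaxed)) {
                      auto until = steady_clock::now() + milliseconds(1);
                      while (steady_clock::now() < until) {}
                      std::this_thread::sleep_for(milliseconds(10));
                  }
              });
              std::this_thread::sleep_for(seconds(10));
              stop = true;
              for (auto& t : busy) {
                  t.join();
              }
              intermittent.join();
              for (int i = 0; i < nbusy; i++) {
                  std::printf("busy thread %d: %ld iterations\n", i, counts[i]);
              }
              return 0;
          }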
    • Readme updates for new build script · babbde69
      Dmitry Fleytman authored
    • 29160b6b
  2. Sep 11, 2013
  3. Sep 10, 2013
    • gdb: Fix osv mmap memory layout · 9ee0d615
      Pekka Enberg authored
      Fix up memory layout of 'class vma' for 'osv mmap' gdb command.
    • mmu: Fix file-backed vma splitting · d72b550c
      Pekka Enberg authored
      Commit 3510a5ea ("mmu: File-backed VMAs") forgot to fix vma::split() to
      take file-backed mappings into account. Fix the problem by making
      vma::split() a virtual function and implementing it separately for
      file_vma.
      
      Spotted by Avi Kivity.
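
      A simplified sketch of the approach (real OSv signatures differ):
      split() becomes virtual, and the file-backed override also advances the
      file offset of the upper half, which the base class cannot know about.

          #include <cstdint>
          #include <sys/types.h>

          class vma {
          public:
              vma(uintptr_t start, uintptr_t end) : _start(start), _end(end) {}
              virtual ~vma() {}
              // Split at 'edge', returning the newly created upper half.
              virtual vma* split(uintptr_t edge)
              {
                  vma* upper = new vma(edge, _end);
                  _end = edge;
                  return upper;
              }
          protected:
              uintptr_t _start, _end;
          };

          class file_vma : public vma {
          public:
              file_vma(uintptr_t start, uintptr_t end, off_t offset)
                  : vma(start, end), _offset(offset) {}
              virtual vma* split(uintptr_t edge) override
              {
                  // The upper half's file offset advances by the size of
                  // the lower half -- the step the original code missed.
                  file_vma* upper =
                      new file_vma(edge, _end, _offset + off_t(edge - _start));
                  _end = edge;
                  return upper;
              }
          private:
              off_t _offset;
          };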
    • Added basic readline configuration · c5c4534c
      Or Cohen authored
      The configuration is parsed by JLine (used by CRaSH). The console
      should now better understand keys like Home/End and the arrow keys.
    • Merge branch 'stty-for-jni' · 8b0ea169
      Or Cohen authored
    • 6b548713
    • DHCP: Fix crash · 68f4d147
      Nadav Har'El authored
      Rarely (about once every 20 runs) we had OSv crash during boot, in the
      DHCP code. It turns out that the code first sends out the DHCP requests,
      and then creates a thread to handle the replies. When a reply arrives,
      the code wake()s the thread, but on rare occasions the thread hasn't yet
      been set up (it is still a null pointer), so we crash.
      
      Fix this by reversing the order - first create the reply handling thread,
      and only then send the request.
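
      A sketch of the fixed ordering (names are illustrative, loosely modeled
      on OSv's sched::thread API):

          // Before, send_dhcp_requests() ran first, so a fast reply could
          // reach the wake() call while _thread was still null.
          void dhcp_start()
          {
              // The reply-handling thread exists before any request
              // leaves the machine...
              _thread = new sched::thread([&] { dhcp_reply_loop(); });
              _thread->start();
              // ...so an arriving reply can always wake() it safely.
              send_dhcp_requests();
          }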
  4. Sep 09, 2013
  5. Sep 08, 2013
    • Scheduler: Fix load-balancer bug · e9f0cf29
      Nadav Har'El authored
      The load_balance() code checks whether another CPU has fewer threads in
      its run queue than this CPU, and if so, migrates one of this CPU's
      threads to the other CPU.

      However, when we count this core's runnable threads, we overcount by 1,
      because as soon as load_balance() goes back to sleep, one of the
      runnable threads will start running. So if this core has just one more
      runnable thread than some remote core, they are actually even, and in
      that case we should *not* migrate a thread.
      
      Overcounting the number of threads on the core running load_balance
      caused bad performance in 2-core and 2-thread SpecJVM: Normally, the
      size of the run queue on each core is 1 (each core is running one of
      the two threads, and on the run queue we have the idle thread). But
      when load_balance runs it sees 2 runnable threads (the idle thread and
      the preempted benchmark thread), and the second core has just 1, so
      it decides to migrate one of its threads to the second CPU. When this
      is over, the second CPU has both benchmark threads, and the first CPU
      has nothing, and this will only be fixed some time later when the
      second CPU's load_balance thread runs, and later the balance will be
      ruined again. All this time that the two threads run on the same CPU
      significantly hurts performance, and on the host's "top" we see qemu
      taking just 120%-150% instead of 200% as it should (and as it does
      after this patch).
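
      In code form, the corrected condition looks roughly like this (names
      are illustrative):

          // One locally queued thread will start running the moment
          // load_balance() goes back to sleep, so don't count it toward
          // the imbalance.
          bool should_migrate(unsigned local_queue, unsigned remote_queue)
          {
              return local_queue - 1 > remote_queue;   // was: local > remote
          }
          // SpecJVM case: local == 2 (idle + preempted benchmark thread),
          // remote == 1; 2 - 1 > 1 is false, so no spurious migration.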
    • Scheduler: Avoid vruntime jump when clock jumps · 253e4536
      Nadav Har'El authored
      Currently, clock::get()->time() jumps (by system_time(), i.e., the host's
      uptime) at some point during the initialization. This can be a huge jump
      (e.g., a week if the host's uptime is a week). Fixing this jump is hard,
      so we'd rather just tolerate it.
      
      reschedule_from_interrupt() handles this clock jump badly. It calculates
      current_run, the amount of time the current thread has run, in a way
      that includes this jump if it happened while the thread was running. In
      the above example, a run time of a whole week is wrongly attributed to
      some thread and added to its vruntime, causing it not to be scheduled
      again until all other threads yield the CPU.
      
      The fix in this patch is to limit the vruntime increase after a long
      run to max_slice (10ms). Even if a thread runs for longer (or just thinks
      it ran for longer), it won't be "penalized" in its dynamic priority more
      than a thread that ran for 10ms. Note that this cap makes sense, as
      cpu::enqueue already enforces a similar limit on the vruntime "bonus"
      of a woken thread, and this patch works toward a similar goal (avoid
      giving one thread a huge bonus because another thread was given a huge
      penalty).
      
      This bug is very visible in the CPU-bound SPECjvm2008 benchmarks, when
      running two benchmark threads on two virtual cpus. As it happens, the
      load_balancer() is the one that gets the huge vruntime increase, so
      it doesn't get to run until no other thread wants to run. Because we start
      with both CPU-bound threads on the same CPU, and these hardly yield the
      CPU (and even more rarely are the two threads sleeping at the same time),
      the load balancer thread on this CPU doesn't get to run, and the two threads
      remain on the same CPU, giving us halved performance (2-cpu performance
      identical to 1-cpu performance) and on the host we see qemu using 100% cpu,
      instead of 200% as expected with two vcpus.
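
      The cap itself is a one-line clamp; roughly (illustrative names, with
      nanosecond times):

          #include <algorithm>
          #include <cstdint>

          // Charge at most one full slice of runtime to the thread's
          // vruntime, even if the clock claims a much longer run (e.g.
          // because the clock jumped by the host's uptime during init).
          int64_t capped_run(int64_t now_ns, int64_t running_since_ns)
          {
              const int64_t max_slice_ns = 10000000;   // 10ms
              return std::min(now_ns - running_since_ns, max_slice_ns);
          }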
    • a8d3a5ca
    • tests: tcp send-only test · e51ef872
      Guy Zana authored
      A test where the guest connects to the host and sends a small packet of
      data. Used to verify that retransmits are working in the Van Jacobson
      code and in the TCP stack in general.
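
      The guest side of such a test is essentially this (standalone sketch;
      the host address and port are placeholders):

          #include <arpa/inet.h>
          #include <cstdio>
          #include <netinet/in.h>
          #include <sys/socket.h>
          #include <unistd.h>

          int main()
          {
              int s = socket(AF_INET, SOCK_STREAM, 0);
              sockaddr_in addr = {};
              addr.sin_family = AF_INET;
              addr.sin_port = htons(5000);                  // placeholder port
              inet_pton(AF_INET, "192.168.122.1",           // placeholder host
                        &addr.sin_addr);
              if (connect(s, (sockaddr*)&addr, sizeof(addr)) < 0) {
                  perror("connect");
                  return 1;
              }
              // A single small packet; if it is lost, the TCP stack's
              // retransmit machinery must resend it.
              const char msg[] = "hello from the guest";
              write(s, msg, sizeof(msg));
              close(s);
              return 0;
          }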
    • build: fix sizing the image during a clean build · 9c10b784
      Avi Kivity authored
      The shell call to stat(1) is evaluated when the rule to build the image
      is triggered, at which point loader-stripped.elf does not exist yet.
      This causes stat to fail and the build to break.
      
      Fix by moving the creation of loader-stripped.elf to its own target, so
      that by the time the recipe is evaluated, the file is known to exist.
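
      Schematically, the fixed rules look something like this (names
      simplified; the actual makefile structure is not shown in the commit
      message, and recipe lines must be tab-indented in a real makefile):

          # loader-stripped.elf now has its own target, so by the time the
          # image recipe below is expanded and stat(1) runs, the
          # prerequisite has already been built.
          loader-stripped.elf: loader.elf
                  $(STRIP) -o $@ $<

          image: loader-stripped.elf
                  size=$$(stat --printf %s loader-stripped.elf); \
                  echo "stripped loader is $$size bytes"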
  6. Sep 06, 2013
  7. Sep 05, 2013
    • build adaptations for single image · 16b47261
      Glauber Costa authored
    • blkfront: mark device ready earlier · 7b0354b9
      Glauber Costa authored
      We cannot read the partition table from the device if the device is not
      marked as ready, since all IO will stall. I believe it should be fine to
      just mark the device ready before we mark our state as connected. With
      that change, everything proceeds normally.
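
      The change is essentially a reordering (names assumed, not the actual
      code):

          device_set_ready(dev);            // IO is possible from here on
          set_state(XenbusStateConnected);  // only then advertise connected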
    • call read partition table · b3e47d9a
      Glauber Costa authored
      I would like to call read_partition_table automatically from
      device_register, which would guarantee that every device that comes up
      has its partitions scanned. Although totally possible on KVM, it is not
      possible on Xen, due to the asynchronous nature of the bringup protocol:
      the device is exposed and created at a moment when IO is not yet
      possible, so reading the partition table would fail. Instead, just read
      it from both drivers once we are sure the driver is ready.
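
      The two candidate call sites, sketched with illustrative names:

          // Rejected: generic, but too early on Xen, where the device is
          // registered before IO is possible.
          void device_register(struct device* dev)
          {
              // read_partition_table(dev);   // would stall on Xen
          }

          // Chosen: each driver calls it at the point where it knows IO
          // will succeed.
          void blkfront_connected(struct device* dev)
          {
              device_set_ready(dev);
              read_partition_table(dev);
          }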
    • read partition table · 7fb8b99b
      Glauber Costa authored
      This code, living in device.c for maximum generality, will read the
      partition table from any disk that calls it. Ideally, each new device
      would have its own private data, but that would mean having to call back
      into the driver to set each of the partitions up. Therefore, I found it
      easier to adopt the convention that all partitions on the same drive
      share the same private data. This makes some sense if we consider that
      the hypervisors are usually agnostic about partitions, and all of the
      addressing and communication goes through a single entry point, which is
      the disk.
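
      A simplified sketch of what such a generic reader does (the struct
      device layout and the helpers are stand-ins; the real code handles
      errors and more cases):

          #include <cstdint>
          #include <cstring>

          struct device {                     // minimal stand-in
              void* private_data;
          };

          struct mbr_entry {
              uint8_t  status;
              uint8_t  chs_first[3];
              uint8_t  type;
              uint8_t  chs_last[3];
              uint32_t lba_start;
              uint32_t lba_count;
          } __attribute__((packed));

          // Assumed helpers, declared for the sketch:
          void device_read_sync(device* dev, void* buf,
                                uint64_t off, size_t len);
          void device_create_partition(device* dev, int index,
                                       uint32_t lba_start, uint32_t lba_count,
                                       void* private_data);

          void read_partition_table(device* dev)
          {
              char sector[512];
              device_read_sync(dev, sector, 0, sizeof(sector));
              mbr_entry entries[4];
              std::memcpy(entries, sector + 446, sizeof(entries));
              for (int i = 0; i < 4; i++) {
                  if (entries[i].type == 0 || entries[i].lba_count == 0) {
                      continue;   // empty slot
                  }
                  // Per the convention above, each partition device shares
                  // the whole disk's private data.
                  device_create_partition(dev, i + 1,
                                          entries[i].lba_start,
                                          entries[i].lba_count,
                                          dev->private_data);
              }
          }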
    • add offset calculation · cd14aecc
      Glauber Costa authored
      To support multiple partitions on a disk, I found it easier to add a
      post-processing offset calculation to the bio just before calling the
      strategy.

      The reason is, we have many (really many) entry points for bio
      preparation (pre-strategy) and only two entry points for the strategy
      itself (the drivers). And since multiplex_strategy allows for
      arbitrarily sized requests, it is a good thing to use even for virtio
      (although I am not converting it now), so we could very well reduce
      those two entry points to just one.
      
      At this moment, the offset is always 0 and everything works as before.
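
      Sketched with illustrative stand-in types, the single post-processing
      point looks like:

          #include <cstdint>

          struct bio;
          struct device {
              uint64_t offset;                    // partition start, bytes
              void (*driver_strategy)(bio*);      // the driver entry point
          };
          struct bio {
              device*  bio_dev;
              uint64_t bio_offset;
          };

          // Applied once, just before the driver's strategy runs, so none
          // of the many bio-preparation paths need to know about partitions.
          void multiplex_strategy(bio* b)
          {
              b->bio_offset += b->bio_dev->offset;   // 0 today: whole disk
              b->bio_dev->driver_strategy(b);
          }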
    • blk: derive size information from device · bfff3c6a
      Glauber Costa authored
      Currently we get it from the private data, but since I plan to use the
      same private data for all partitions, we need a per-device value, and
      one already exists in the device. So use it.
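
      Sketched with illustrative names, the change amounts to:

          #include <sys/types.h>

          struct device {                 // minimal stand-in
              off_t size;
              void* private_data;
          };

          off_t device_size(device* dev)
          {
              // Before: the size came from private data, which all
              // partitions now share. After: the device's own, unique size.
              return dev->size;
          }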
    • boot16.S: open up space for partition table · 4a6d51d5
      Glauber Costa authored
      Because we will be copying the bootloader code to the beginning of the
      disk, make sure we don't step over the partition table space. This is
      technically not needed if the code is small enough, but this guard will
      1) make sure that doesn't happen, and 2) make sure the space is zeroed
      out.

      The signature, though, is needed, and is set to the bytes "O", "S" and
      "V", which will spell "VSO" in the end.
    • imgedit: extend image editing script to deal with partitions · 716ad81d
      Glauber Costa authored
      Given a partition size and start address, this will edit the image
      passed as a parameter to create a partition entry. While setting the CHS
      address, it assumes the disk is always bigger than 8GB. From the osdev
      wiki:
      
      "For drives smaller than 8GB, the LBA fields and the CHS fields must "match"
       when the values are converted into the other format.  For drives bigger than
       8GB, generally the CHS fields are set to Cylinder = 1023, Head = 254 or 255,
       Sector = 63 -- which is considered an invalid setting."
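
      For illustration, here is the partition-entry layout the script has to
      write, sketched in C++ (the actual script is separate; offsets follow
      the classic MBR format, and the capped CHS value above encodes as the
      bytes FE FF FF):

          #include <cstdint>
          #include <cstring>

          struct mbr_partition_entry {
              uint8_t  status;        // 0x80 = bootable
              uint8_t  chs_first[3];
              uint8_t  type;          // e.g. 0x83 (Linux)
              uint8_t  chs_last[3];
              uint32_t lba_start;     // little-endian, as on x86
              uint32_t lba_count;
          } __attribute__((packed));

          void fill_entry(mbr_partition_entry& e,
                          uint32_t lba_start, uint32_t lba_count)
          {
              std::memset(&e, 0, sizeof(e));
              e.type = 0x83;
              // Disk assumed > 8GB: cylinder 1023, head 254, sector 63.
              const uint8_t capped_chs[3] = { 0xfe, 0xff, 0xff };
              std::memcpy(e.chs_first, capped_chs, sizeof(capped_chs));
              std::memcpy(e.chs_last, capped_chs, sizeof(capped_chs));
              // Only the LBA fields carry the real geometry.
              e.lba_start = lba_start;
              e.lba_count = lba_count;
          }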