  Nov 25, 2013
    • Start up shell and management web in parallel · c29222c6
      Amnon Heiman authored
      
      Start up shell and management web in parallel to make boot faster.  Note
      that we also switch to the latest mgmt.git, which decouples JRuby and
      CRaSH startup.
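
      A minimal sketch of the idea, assuming two hypothetical entry points
      start_shell() and start_mgmt_web() (the real OSv startup code differs):

        #include <thread>

        // Hypothetical entry points, for illustration only.
        void start_shell();
        void start_mgmt_web();

        void boot_services()
        {
            // Launch both services concurrently instead of sequentially.
            std::thread shell(start_shell);
            std::thread web(start_mgmt_web);
            shell.join();
            web.join();
        }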
      
      Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
      Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
    • java: Support for loading multiple mains · 10d6f18b
      Amnon Heiman authored
      
      When using the MultiJarLoader as the main class, it will use a
      configuration file for the java loading.  Each line in the file starts
      one main; a line can use -jar or specify a main class explicitly.
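
      A hypothetical configuration file, for illustration only (the contents
      are made up, not taken from the commit), with one main per line:

        -jar /usr/mgmt/web.jar
        com.example.Main arg1 arg2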
      
      Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
      Reviewed-by: Tomasz Grabiec <tgrabiec@cloudius-systems.com>
      Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
    • tests: mincore() tests for demand paging · 20aad632
      Pekka Enberg authored
      
      As suggested by Nadav, add tests for mincore() interaction with demand
      paging.
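
      A minimal sketch of such a test, relying on standard POSIX
      mmap()/mincore() semantics (not the committed test code):

        #include <sys/mman.h>
        #include <cassert>

        int main()
        {
            size_t len = 4096;
            // Map anonymous memory; under demand paging the page is not
            // resident until touched.
            char* p = static_cast<char*>(mmap(nullptr, len,
                PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0));
            unsigned char vec;
            mincore(p, len, &vec);
            assert(!(vec & 1));   // untouched page reported as not resident
            p[0] = 1;             // touch the page to fault it in
            mincore(p, len, &vec);
            assert(vec & 1);      // now resident
            munmap(p, len);
        }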
      
      Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
    • tests: Anonymous demand paging microbenchmark · d4bcf559
      Pekka Enberg authored
      
      This adds a simple mmap microbenchmark that can be run on both OSv and
      Linux.  The benchmark mmaps memory for various sizes and touches the
      mmap'd memory in 4K increments to fault in memory.  The benchmark also
      repeats the same tests using MAP_POPULATE for reference.
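
      The core loop of such a benchmark might look like this sketch (not the
      committed test itself):

        #include <sys/mman.h>
        #include <chrono>
        #include <cstdio>

        // Time faulting in 'len' bytes, touching one byte per 4K page.
        static double time_mmap(size_t len, int extra_flags)
        {
            auto start = std::chrono::steady_clock::now();
            char* p = static_cast<char*>(mmap(nullptr, len,
                PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_ANONYMOUS | extra_flags, -1, 0));
            for (size_t off = 0; off < len; off += 4096) {
                p[off] = 1;   // fault in one page at a time
            }
            munmap(p, len);
            std::chrono::duration<double> d =
                std::chrono::steady_clock::now() - start;
            return d.count();
        }

        int main()
        {
            for (size_t mb = 1; mb <= 1024; mb *= 2) {
                printf("%5zu %.3f  %.3f\n", mb,
                       time_mmap(mb << 20, 0),             // demand
                       time_mmap(mb << 20, MAP_POPULATE)); // populate
            }
        }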
      
      OSv page faults are slightly slower than Linux's on the first iteration
      but faster on subsequent iterations, after the host operating system has
      faulted in memory for the guest.
      
      I've included full numbers on a 2-core Sandy Bridge i7 for an OSv guest,
      a Linux guest, and a Linux host below:
      
        OSv guest
        ---------
      
        Iteration 1
      
             time (seconds)
         MiB demand populate
           1 0.004  0.000
           2 0.000  0.000
           4 0.000  0.000
           8 0.001  0.000
          16 0.003  0.000
          32 0.007  0.000
          64 0.013  0.000
         128 0.024  0.000
         256 0.052  0.001
         512 0.229  0.002
        1024 0.587  0.005
      
        Iteration 2
      
             time (seconds)
         MiB demand populate
           1 0.001  0.000
           2 0.000  0.000
           4 0.000  0.000
           8 0.001  0.000
          16 0.002  0.000
          32 0.004  0.000
          64 0.010  0.000
         128 0.019  0.001
         256 0.036  0.001
         512 0.069  0.002
        1024 0.137  0.005
      
        Iteration 3
      
             time (seconds)
         MiB demand populate
           1 0.001  0.000
           2 0.000  0.000
           4 0.000  0.000
           8 0.001  0.000
          16 0.002  0.000
          32 0.005  0.000
          64 0.010  0.000
         128 0.020  0.000
         256 0.039  0.001
         512 0.087  0.002
        1024 0.138  0.005
      
        Iteration 4
      
             time (seconds)
         MiB demand populate
           1 0.001  0.000
           2 0.000  0.000
           4 0.000  0.000
           8 0.001  0.000
          16 0.002  0.000
          32 0.004  0.000
          64 0.012  0.000
         128 0.025  0.001
         256 0.040  0.001
         512 0.082  0.002
        1024 0.138  0.005
      
        Iteration 5
      
             time (seconds)
         MiB demand populate
           1 0.001  0.000
           2 0.000  0.000
           4 0.000  0.000
           8 0.001  0.000
          16 0.002  0.000
          32 0.004  0.000
          64 0.012  0.000
         128 0.028  0.001
         256 0.040  0.001
         512 0.082  0.002
        1024 0.166  0.005
      
        Linux guest
        -----------
      
        Iteration 1
      
             time (seconds)
         MiB demand populate
           1 0.001  0.000
           2 0.001  0.000
           4 0.002  0.000
           8 0.003  0.000
          16 0.005  0.000
          32 0.008  0.000
          64 0.015  0.000
         128 0.151  0.001
         256 0.090  0.001
         512 0.266  0.003
        1024 0.401  0.006
      
        Iteration 2
      
             time (seconds)
         MiB demand populate
           1 0.000  0.000
           2 0.000  0.000
           4 0.001  0.000
           8 0.001  0.000
          16 0.002  0.000
          32 0.005  0.000
          64 0.009  0.000
         128 0.019  0.001
         256 0.037  0.001
         512 0.072  0.003
        1024 0.144  0.006
      
        Iteration 3
      
             time (seconds)
         MiB demand populate
           1 0.000  0.000
           2 0.001  0.000
           4 0.001  0.000
           8 0.001  0.000
          16 0.002  0.000
          32 0.005  0.000
          64 0.010  0.000
         128 0.019  0.001
         256 0.037  0.001
         512 0.072  0.003
        1024 0.143  0.006
      
        Iteration 4
      
             time (seconds)
         MiB demand populate
           1 0.000  0.000
           2 0.001  0.000
           4 0.001  0.000
           8 0.001  0.000
          16 0.003  0.000
          32 0.005  0.000
          64 0.010  0.000
         128 0.020  0.001
         256 0.038  0.001
         512 0.073  0.003
        1024 0.143  0.006
      
        Iteration 5
      
             time (seconds)
         MiB demand populate
           1 0.000  0.000
           2 0.001  0.000
           4 0.001  0.000
           8 0.001  0.000
          16 0.003  0.000
          32 0.005  0.000
          64 0.010  0.000
         128 0.020  0.001
         256 0.037  0.001
         512 0.072  0.003
        1024 0.144  0.006
      
        Linux host
        ----------
      
        Iteration 1
      
             time (seconds)
         MiB demand populate
           1 0.000  0.000
           2 0.001  0.000
           4 0.001  0.000
           8 0.001  0.000
          16 0.002  0.000
          32 0.005  0.000
          64 0.009  0.000
         128 0.019  0.001
         256 0.035  0.001
         512 0.152  0.003
        1024 0.286  0.011
      
        Iteration 2
      
             time (seconds)
         MiB demand populate
           1 0.000  0.000
           2 0.000  0.000
           4 0.001  0.000
           8 0.001  0.000
          16 0.002  0.000
          32 0.004  0.000
          64 0.010  0.000
         128 0.018  0.001
         256 0.035  0.001
         512 0.192  0.003
        1024 0.334  0.011
      
        Iteration 3
      
             time (seconds)
         MiB demand populate
           1 0.000  0.000
           2 0.000  0.000
           4 0.001  0.000
           8 0.001  0.000
          16 0.002  0.000
          32 0.004  0.000
          64 0.010  0.000
         128 0.018  0.001
         256 0.035  0.001
         512 0.194  0.003
        1024 0.329  0.011
      
        Iteration 4
      
             time (seconds)
         MiB demand populate
           1 0.000  0.000
           2 0.000  0.000
           4 0.001  0.000
           8 0.001  0.000
          16 0.002  0.000
          32 0.004  0.000
          64 0.010  0.000
         128 0.018  0.001
         256 0.036  0.001
         512 0.138  0.003
        1024 0.341  0.011
      
        Iteration 5
      
             time (seconds)
         MiB demand populate
           1 0.000  0.000
           2 0.000  0.000
           4 0.001  0.000
           8 0.001  0.000
          16 0.002  0.000
          32 0.004  0.000
          64 0.010  0.000
         128 0.018  0.001
         256 0.035  0.001
         512 0.135  0.002
        1024 0.324  0.011
      
      Reviewed-by: Glauber Costa <glommer@cloudius-systems.com>
      Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
    • mmu: Anonymous memory demand paging · c1d5fccb
      Pekka Enberg authored
      
      Switch to demand paging for anonymous virtual memory.
      
      I used SPECjvm2008 to verify the performance impact. The numbers are
      mostly the same, with a few exceptions, most visible in the 'serial'
      benchmark. However, there's quite a lot of variance between SPECjvm2008
      runs, so I wouldn't read too much into them.
      
      As we need the demand paging mechanism, and the performance numbers
      suggest that the implementation is reasonable, I'd merge the patch as-is
      and optimize it later.
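
      Conceptually, anonymous demand paging defers allocation from mmap()
      time to the first touch of each page. A rough sketch of the pattern,
      with stubs standing in for the real allocator and page-table code (this
      is illustrative, not OSv's actual fault handler):

        #include <cstdint>
        #include <cstdlib>

        // Stubs for illustration; the real allocator and mapper differ.
        static void* alloc_zeroed_page() { return calloc(1, 4096); }
        static void map_page(uintptr_t va, void* page, unsigned perm)
        { (void)va; (void)page; (void)perm; /* arch-specific */ }

        struct anon_vma { uintptr_t start, end; unsigned perm; };

        // Page fault path: instead of populating the whole range at mmap()
        // time, allocate and map a single zeroed page on first access.
        void demand_fault(const anon_vma& vma, uintptr_t addr)
        {
            uintptr_t va = addr & ~uintptr_t(4095);  // align down to page
            map_page(va, alloc_zeroed_page(), vma.perm);
        }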
      
        Before:
      
          Running SPECjvm2008 benchmarks on an OSv guest.
          Score on compiler.compiler: 331.23 ops/m
          Score on compiler.sunflow: 131.87 ops/m
          Score on compress: 118.33 ops/m
          Score on crypto.aes: 41.34 ops/m
          Score on crypto.rsa: 204.12 ops/m
          Score on crypto.signverify: 196.49 ops/m
          Score on derby: 170.12 ops/m
          Score on mpegaudio: 70.37 ops/m
          Score on scimark.fft.large: 36.68 ops/m
          Score on scimark.lu.large: 13.43 ops/m
          Score on scimark.sor.large: 22.29 ops/m
          Score on scimark.sparse.large: 29.35 ops/m
          Score on scimark.fft.small: 195.19 ops/m
          Score on scimark.lu.small: 233.95 ops/m
          Score on scimark.sor.small: 90.86 ops/m
          Score on scimark.sparse.small: 64.11 ops/m
          Score on scimark.monte_carlo: 145.44 ops/m
          Score on serial: 94.95 ops/m
          Score on sunflow: 73.24 ops/m
          Score on xml.transform: 207.82 ops/m
          Score on xml.validation: 343.59 ops/m
      
        After:
      
          Score on compiler.compiler: 346.78 ops/m
          Score on compiler.sunflow: 132.58 ops/m
          Score on compress: 116.05 ops/m
          Score on crypto.aes: 40.26 ops/m
          Score on crypto.rsa: 206.67 ops/m
          Score on crypto.signverify: 194.47 ops/m
          Score on derby: 175.22 ops/m
          Score on mpegaudio: 76.18 ops/m
          Score on scimark.fft.large: 34.34 ops/m
          Score on scimark.lu.large: 15.00 ops/m
          Score on scimark.sor.large: 24.80 ops/m
          Score on scimark.sparse.large: 33.10 ops/m
          Score on scimark.fft.small: 168.67 ops/m
          Score on scimark.lu.small: 236.14 ops/m
          Score on scimark.sor.small: 110.77 ops/m
          Score on scimark.sparse.small: 121.29 ops/m
          Score on scimark.monte_carlo: 146.03 ops/m
          Score on serial: 87.03 ops/m
          Score on sunflow: 77.33 ops/m
          Score on xml.transform: 205.73 ops/m
          Score on xml.validation: 351.97 ops/m
      
      Reviewed-by: Glauber Costa <glommer@cloudius-systems.com>
      Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
    • mmu: Optimistic locking in populate() · 7e568ba0
      Pekka Enberg authored
      
      Use optimistic locking in populate() to make it robust against
      concurrent page faults.
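
      For illustration, one generic optimistic pattern (names made up, not
      the actual mmu code): install a page table entry only if no concurrent
      page fault got there first, and keep the existing entry otherwise.

        #include <atomic>
        #include <cstdint>

        struct page_entry { std::atomic<uintptr_t> pte; };

        bool populate_one(page_entry& e, uintptr_t new_pte)
        {
            uintptr_t expected = 0;  // expect an empty entry
            // If a concurrent fault already installed a mapping, the CAS
            // fails and the existing mapping is kept.
            return e.pte.compare_exchange_strong(expected, new_pte);
        }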
      
      Reviewed-by: Glauber Costa <glommer@cloudius-systems.com>
      Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
    • mmu: VMA permission flags · 8a56dc8c
      Pekka Enberg authored
      
      Add permission flags to VMAs. They will be used by mprotect() and the
      page fault handler.
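
      For illustration, such permission flags are typically a small bitmask
      (these names are made up, not the committed ones):

        // Hypothetical flag names, for illustration only.
        enum vma_perm : unsigned {
            perm_read  = 1 << 0,
            perm_write = 1 << 1,
            perm_exec  = 1 << 2,
        };

        // e.g. the page fault handler can reject a write to a read-only VMA:
        inline bool write_allowed(unsigned perm) { return perm & perm_write; }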
      
      Reviewed-by: Glauber Costa <glommer@cloudius-systems.com>
      Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
    • loader.py: add commands for function duration analysis · af723084
      Tomasz Grabiec authored
      
      Duration analysis is based on trace pairs which follow the convention
      that function entry generates a trace named X and function exit generates
      either trace X_ret or X_err. Traces which do not have an accompanying
      return tracepoint are ignored.
      
      New commands:
      
        osv trace summary
      
            Prints execution time statistics for traces.

        osv trace duration {function}

            Prints timed traces sorted by duration in descending order,
            optionally narrowed down to a specified function.
      
      gdb$ osv trace summary
      Execution times [ms]:
      name          count      min      50%      90%      99%    99.9%      max    total
      vfs_pwritev       3    0.682    1.042    1.078    1.078    1.078    1.078    2.801
      vfs_pwrite       32    0.006    1.986    3.313    6.816    6.816    6.816   53.007
      
      gdb$ osv trace duration
      0xffffc000671f0010  1    1385318632.103374   6.816 vfs_pwrite
      0xffffc0003bbef010  0    1385318637.929424   3.923 vfs_pwrite
      
      Signed-off-by: Tomasz Grabiec <tgrabiec@cloudius-systems.com>
      Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
    • loader.py: add wrapper for intrusive list · 6cc939a6
      Tomasz Grabiec authored
      
      The iteration logic was duplicated in two places, and upcoming patches
      would add a third, so let's refactor first.
      
      Signed-off-by: Tomasz Grabiec <tgrabiec@cloudius-systems.com>
      Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
    • libc/network: feof shouldn't be used on a closed file · df6278fe
      Raphael S. Carvalho authored
      
      Calling feof on a closed file isn't safe, and the result is undefined.
      Found while auditing the code.
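
      A sketch of the kind of bug being fixed, in terms of standard C stdio
      (the actual call site is in the libc network code):

        #include <cstdio>

        void broken(FILE* f)
        {
            fclose(f);
            if (feof(f)) { /* undefined: f is already closed */ }
        }

        void fixed(FILE* f)
        {
            bool at_eof = feof(f) != 0;  // query while the stream is open
            fclose(f);
            if (at_eof) { /* safe: answer captured before closing */ }
        }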
      
      Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
      Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
    • sched: fix iteration across timer list · 9c3308f1
      Avi Kivity authored
      
      We iterate over the timer list using an iterator, but the timer list can
      change during iteration due to timers being re-inserted.
      
      Switch to just looking at the head of the list instead, maintaining no
      state across loop iterations.
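
      A sketch of the pattern with a generic list (not the actual sched
      code): always re-examine the head, so no iterator survives a callback
      that may mutate the list.

        #include <list>

        struct timer {
            unsigned long when;
            void fire();   // may re-insert this or other timers
        };

        void expire_timers(std::list<timer*>& timers, unsigned long now)
        {
            while (!timers.empty() && timers.front()->when <= now) {
                timer* t = timers.front();
                timers.pop_front();
                t->fire();   // list may change; head is re-read next loop
            }
        }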
      
      Signed-off-by: Avi Kivity <avi@cloudius-systems.com>
      Tested-by: Pekka Enberg <penberg@cloudius-systems.com>
      Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
    • sched: prevent a re-armed timer from being ignored · 870d8410
      Avi Kivity authored
      
      When a hardware timer fires, we walk over the timer list, expiring timers
      and erasing them from the list.
      
      This is all well and good, except that a timer may rearm itself in its
      callback (this only holds for timer_base clients, not sched::timer, which
      consumes its own callback).  If it does, we end up erasing it even though
      it wants to be triggered.
      
      Fix by checking for the armed state before erasing.
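
      Sketched in generic terms (not the actual sched code), assuming every
      timer on the list has already expired: run the callback first, then
      erase the timer only if it did not re-arm itself.

        #include <list>

        struct timer {
            bool armed = true;
            void fire();   // a timer_base client may re-arm itself here
        };

        void expire_timers(std::list<timer*>& timers)
        {
            for (auto it = timers.begin(); it != timers.end(); ) {
                timer* t = *it;
                t->armed = false;
                t->fire();                  // callback may set armed again
                if (t->armed) {
                    ++it;                   // re-armed: keep the timer
                } else {
                    it = timers.erase(it);  // expired for good: erase
                }
            }
        }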
      
      Signed-off-by: Avi Kivity <avi@cloudius-systems.com>
      Tested-by: Pekka Enberg <penberg@cloudius-systems.com>
      Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
    • Fix possible deadlock in condvar · 15a32ac8
      Nadav Har'El authored
      
      When a condvar's timeout and wakeup race, we wait for the concurrent
      wakeup to complete, so it won't crash. We did this wr.wait() with the
      condvar's internal mutex (m) locked, which was fine when this code was
      written; but now that we have wait morphing, wr.wait() waits not just
      for the wakeup to complete, but also for the user_mutex to become
      available. With m locked and us waiting for user_mutex, we're in
      deadlock territory, because a common idiom of using a condvar is to
      take the locks in the opposite order: lock user_mutex first and then
      use the condvar, which locks m.
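
      The idiom in question, sketched with POSIX-style calls (the OSv
      condvar API is analogous):

        #include <pthread.h>

        pthread_mutex_t user_mutex = PTHREAD_MUTEX_INITIALIZER;
        pthread_cond_t  cond       = PTHREAD_COND_INITIALIZER;
        bool ready = false;

        void waiter()
        {
            pthread_mutex_lock(&user_mutex);   // user_mutex first...
            while (!ready) {
                // ...then the condvar, which takes its internal mutex (m).
                // With wait morphing, waking up must reacquire user_mutex,
                // so holding m while waiting for user_mutex inverts this
                // order and risks deadlock.
                pthread_cond_wait(&cond, &user_mutex);
            }
            pthread_mutex_unlock(&user_mutex);
        }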
      
      I can't think of an easy way to actually demonstrate this deadlock,
      short of a locked condvar_wait timeout racing with condvar_wake_one
      while an additional locked condvar operation comes in concurrently,
      so I don't have a test case demonstrating it. I am hoping this will
      fix the lockups that Pekka is seeing in his Cassandra tests (which
      are the reason I looked for possible condvar deadlocks in the first
      place).
      
      Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
      Tested-by: Pekka Enberg <penberg@cloudius-systems.com>
      Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
    • sched: delay initialization of early threads · d91d7799
      Glauber Costa authored
      
      The problem with sleep is that we can initialize early threads before
      the cpu itself is initialized. If we look at what goes on in
      init_on_cpu, it should become clear:
      
      void cpu::init_on_cpu()
      {
          arch.init_on_cpu();
          clock_event->setup_on_cpu();
      }
      
      When we finally initialize the clock_event, it can get lost if we
      already have pending timers of any kind - which we may, if early
      threads are start()ed before that. I have played with many potential
      solutions, but in the end I think the most sensible thing to do is to
      delay initialization of early threads until we are first idle. That is
      the best way to guarantee that everything is properly initialized and
      running.
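
      A sketch of the approach under hypothetical names (the real scheduler
      code differs): queue threads that are started too early and launch
      them once the cpu first goes idle.

        #include <vector>

        struct thread { void do_start(); };

        static std::vector<thread*> early_threads;
        static bool cpu_initialized = false;

        void start(thread* t)
        {
            if (!cpu_initialized) {
                early_threads.push_back(t);  // clock_event not set up yet
                return;
            }
            t->do_start();
        }

        // Called when the cpu first idles, after init_on_cpu() has run.
        void on_first_idle()
        {
            cpu_initialized = true;
            for (thread* t : early_threads) {
                t->do_start();               // timers can no longer be lost
            }
            early_threads.clear();
        }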
      
      Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
      Signed-off-by: Avi Kivity <avi@cloudius-systems.com>