  1. Jan 10, 2014
    • libc: support more time modes · bcc2fbb8
      Glauber Costa authored
      
      We are currently only answering requests for CLOCK_REALTIME, but we could
      easily handle:
      
          * CLOCK_REALTIME_COARSE, which is effectively the same as CLOCK_REALTIME
            but faster. In our case, all time sources are equally fast.
          * CLOCK_PROCESS_CPUTIME_ID and CLOCK_THREAD_CPUTIME_ID, since we can
            easily get runtimes for our threads and publish that.
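
      From the caller's side, the clock IDs listed above are all read through
      the same POSIX clock_gettime() call. The following is a minimal sketch of
      such a caller, not the code from this commit; the helper name
      read_clock_ns is an assumption made for illustration.

```cpp
#include <cstdint>
#include <time.h>

// Hypothetical helper: reads the given POSIX clock and returns its value in
// nanoseconds, or -1 if clock_gettime() rejects the clock ID.
inline std::int64_t read_clock_ns(clockid_t id) {
    timespec ts{};
    if (clock_gettime(id, &ts) != 0) {
        return -1;  // clock ID not supported by this libc
    }
    return static_cast<std::int64_t>(ts.tv_sec) * 1000000000LL + ts.tv_nsec;
}
```

      A libc that only answers CLOCK_REALTIME would return -1 here for
      CLOCK_PROCESS_CPUTIME_ID and CLOCK_THREAD_CPUTIME_ID; with the modes
      described above, the same calls would yield CPU runtimes instead.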
      
      Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
      Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
      bcc2fbb8
    • x64: Drop APIC base boot message · ba7250c9
      Pekka Enberg authored
      
      The "APIC base" message is not very useful to users. Drop it.
      
      Reviewed-by: Glauber Costa <glommer@cloudius-systems.com>
      Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
      ba7250c9
    • x64: Simplify CPU bringup boot message · d291601d
      Pekka Enberg authored
      
      Currently, OSv prints out the following at boot:
      
        acpi 0 apic 0
        acpi 1 apic 1
        acpi 2 apic 2
        acpi 3 apic 3
      
      Replace that with a simpler message:
      
        4 CPUs detected
      
      We do lose the ACPI ID -> CPU ID mapping but it is not terribly
      important for users.
      
      Suggested-by: Nadav Har'El <nyh@cloudius-systems.com>
      Reviewed-by: Glauber Costa <glommer@cloudius-systems.com>
      Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
      d291601d
    • bsd: Simplify networking init message · c69e2d34
      Pekka Enberg authored
      
      Simplify networking boot initialization message as suggested by Tzach.
      
      Suggested-by: Tzach Livyatan <tzach@cloudius-systems.com>
      Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
      c69e2d34
    • loader: Update Cloudius copyright · 7740d2cb
      Pekka Enberg authored
      
      It's 2014 now.
      
      Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
      7740d2cb
    • jvm: set max_heap to all available memory. · 8ea89c9c
      Glauber Costa authored
      
      We respect -Xmx when instructed by the user, but when that is left blank, we
      set it to all the memory we have remaining. That is not 100% accurate
      because the JVM itself will use some memory, but it should be a good enough
      estimate, especially given that some of the memory currently in use by OSv
      could potentially be freed in the future should we need it.
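
      The policy described above can be sketched as a small option filter. This
      is an illustration under assumptions, not the code from this commit: the
      function name with_default_max_heap and the choice to express the limit
      in kilobytes are made up for the example.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: returns the JVM options to use. If the caller already
// passed an -Xmx option, the options come back unchanged; otherwise an -Xmx
// option derived from free_bytes (rounded down to whole kilobytes) is appended.
inline std::vector<std::string> with_default_max_heap(
        std::vector<std::string> options, std::size_t free_bytes) {
    for (const auto& opt : options) {
        if (opt.compare(0, 4, "-Xmx") == 0) {
            return options;  // the user's explicit choice wins
        }
    }
    options.push_back("-Xmx" + std::to_string(free_bytes / 1024) + "k");
    return options;
}
```

      What counts as "remaining memory" is left to the caller here; as noted
      above, any such figure is only an estimate.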
      
      Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
      Reviewed-by: Nadav Har'El <nyh@cloudius-systems.com>
      Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
      8ea89c9c
    • jvm_balloon: disable balloon upon jvm memory pressure. · 0034af3f
      Glauber Costa authored
      
      The biggest problem I am seeing with the balloon is that right now the only
      time we call the balloon is when we're seeing memory pressure. If pressure is
      coming from the JVM, we can livelock in quite interesting ways. We need to
      detect that and disable the balloon in those situations, since ballooning when
      the pressure comes from the JVM will only thrash our workloads.
      
      It's not yet working reliably, but this is the direction I plan to start from.
      
      Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
      Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
      0034af3f
    • mm: Count total memory used by the JVM heap · 478f8746
      Glauber Costa authored
      
      To make informed reclaim decisions, we need to have as much relevant
      information as possible about our reclaim targets. Specifically, it
      is useful to know how much memory is currently used by the JVM heap.
      
      The reasoning behind this is that if pressure is coming from the heap,
      ballooning will harm us, instead of helping us.
      
      Note: This is really just a first approximation. Ideally, what matters is
      not total memory but the memory delta since a last common event.
      But counting memory is the first step for both.
      
      Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
      Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
      478f8746
    • jvm: insert probe · b32a006b
      Glauber Costa authored
      
      To find out which vmas hold the Java heap, we will use a technique that is very
      close to ballooning (in the implementation, it is effectively the same).
      
      We will insert a very small element (2 pages), and mark the
      vma where the object is present as containing the JVM heap. Due to the way the
      JVM allocates objects, that will end up in the young generation. As time
      passes, the object will move the same way the balloon moves, and every new vma
      that is seen will be marked as holding the JVM heap.
      
      That mechanism should work for every generational GC, which should encompass
      most of the JDK7 GCs (if not all). It shouldn't work with the G1GC, but that
      debuts at JDK8, and for that we can do something a lot simpler, namely: having
      the JVM tell us in advance which map areas contain the heap.
      
      Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
      Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
      b32a006b
    • java: memory pressure monitor · 88343714
      Glauber Costa authored
      
      The best possible criterion for deflating balloons is heap pressure: whenever
      there is pressure in the JVM, we should give back memory so the pressure stops.
      
      To accomplish that, we need to somehow tap into the JVM. This patch registers
      an MXBean that will send us notifications about collections. We will ignore
      minor collections and act upon major collections by deflating any existing
      balloons.
      
      Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
      Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
      88343714
    • jvm_balloon: control shrinker activation / deactivation · 52cb4738
      Glauber Costa authored
      
      There are restrictions on when and how a shrinker can run. For instance, if we
      have no balloons inflated, there is nothing to deflate (the relaxer should,
      then, be deactivated). Likewise, when the JVM fails to allocate memory for an
      extra balloon, it is pointless to keep trying (which would only lead to
      unnecessary spins) until *at least* the next garbage collection phase.
      
      I believe this behavior of activation / deactivation ought to be
      shrinker-specific. The reclaiming framework will only provide the infrastructure to do
      so.
      
      In this patch, the JVM Balloon uses that to inform the reclaimer when it makes
      sense for the shrinker or relaxer to be called.
      
      Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
      Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
      52cb4738
    • JVM ballon driver · 9c59e7e8
      Glauber Costa authored
      
      This patch implements the JVM balloon driver, which is responsible for borrowing
      memory from the JVM when OSv is short on memory, and giving it back when memory
      is plentiful. It works by allocating a Java byte array, and then unmapping a large
      page-aligned region inside it (as big as the array's size allows).
      
      This array is good to go until the GC decides to move us. When that happens, we
      need to carefully emulate the memcpy fault and put things back in place.
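
      Finding the page-aligned region inside the byte array is a matter of
      rounding the array's start up and its end down to the alignment. The
      sketch below shows only that arithmetic; the names region and
      aligned_region_inside are assumptions for illustration, and the actual
      unmapping is left out.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical result type: start address and size in bytes of the region.
struct region {
    std::uintptr_t start;
    std::size_t size;
};

// Returns the largest region aligned to `align` (a power of two) that lies
// entirely inside [base, base + len). The size is 0 if no aligned block fits.
inline region aligned_region_inside(std::uintptr_t base, std::size_t len,
                                    std::size_t align) {
    const std::uintptr_t mask = ~(static_cast<std::uintptr_t>(align) - 1);
    const std::uintptr_t start = (base + align - 1) & mask;  // round up
    const std::uintptr_t end = (base + len) & mask;          // round down
    if (end <= start) {
        return {start, 0};
    }
    return {start, static_cast<std::size_t>(end - start)};
}
```

      For example, with 4096-byte pages an array at 0x1010 of length 0x3000
      yields the region starting at 0x2000 with size 0x2000; the unaligned head
      and tail of the array stay mapped.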
      
      Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
      Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
      9c59e7e8
    • mmu: implement a new JVM vma · b657d2b3
      Glauber Costa authored
      
      After carrying out some testing, I quickly realized that the old fixup-only
      solution I was attempting for the ballooning was not really flying. The reason
      is that we would take a fault, figure out the fixup address, and
      return.  If that wasn't a JVM fault, we were forced to take another fault
      (since we were already out of fault context).
      
      Once demand paging is a reality, the vast majority of the faults are for
      non-balloon addresses, so we were effectively doubling our number of page faults
      for no reason. I have decided to go with the VMA (+fixups for instruction
      decoding) route after all. This is way more efficient and it seems to be
      working fine.
      
      The JVM vma is really close to the normal anonymous VMA, except that it can
      never hold pages, and its fault handler calls into the JVM balloon facilities
      for decoding.
      
      Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
      Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
      b657d2b3
    • mempool: shrink memory when no longer used. · 4afd087b
      Glauber Costa authored
      
      This patch introduces the memory reclaimer thread, which I hope to use to
      dispose of unused memory when pressure kicks in. "Pressure" right now is
      defined as having only 20% of total memory available, but that can be
      revisited.
      
      The way it will work is that each memory user that is able to dispose of its
      memory will register a shrinker, and the reclaimer will loop through them.
      However, the current "loop through all" only "works" because we have only one
      shrinker being registered. When others appear, we need better policies to drive
      how much to take, and from whom.
      
      Memory allocation will now wait if memory is not available, instead of
      aborting.  The decision to abort should belong to the reclaimer and no one
      else.
      
      We should never expect to have an unbounded and, more importantly, entirely
      opaque number of shrinkers like Linux does. We have control of who they are and how
      they behave, so I expect that we will be able to make a lot better decisions
      in the long run.
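
      The registration and "loop through all" policy described above can be
      sketched as follows. This is an illustration under assumptions, not
      OSv's reclaimer: the shrinker and reclaimer types and their member names
      are made up for the example, and the real reclaimer runs as a thread
      rather than being called directly.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical shrinker: asked to free up to `target` bytes, it reports how
// many bytes it actually released.
struct shrinker {
    std::string name;
    std::function<std::size_t(std::size_t target)> release;
};

class reclaimer {
public:
    explicit reclaimer(std::size_t total_bytes) : _total(total_bytes) {}

    void register_shrinker(shrinker s) { _shrinkers.push_back(std::move(s)); }

    // "Pressure" here means less than 20% of total memory is free.
    bool under_pressure(std::size_t free_bytes) const {
        return free_bytes < _total / 5;
    }

    // One pass of the naive policy: visit every shrinker in registration
    // order until enough memory has been released to leave pressure.
    std::size_t reclaim(std::size_t free_bytes) {
        if (!under_pressure(free_bytes)) {
            return 0;
        }
        const std::size_t target = _total / 5 - free_bytes;
        std::size_t released = 0;
        for (auto& s : _shrinkers) {
            if (released >= target) {
                break;
            }
            released += s.release(target - released);
        }
        return released;
    }

private:
    std::size_t _total;
    std::vector<shrinker> _shrinkers;
};
```

      With a single shrinker the visiting order is irrelevant, which is why the
      simple loop is adequate for now; choosing how much to take from each of
      several shrinkers is the policy question raised above.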
      
      Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
      Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
      4afd087b
    • semaphore: allow extending the interface · 21d9c318
      Glauber Costa authored
      
      Following an early suggestion from Nadav, I am trying to use semaphores for the
      balloon instead of keeping our own queue. For that to work, I need to have a bit
      more functionality that may not belong in the main balloon class. Namely:
      
      1) I need to query for the presence of waiters (and maybe in the future for the
      number of waiters)
      
      2) I need a special post that would allow me to make sure that we are posting
      at most as much as we're waiting for, and nothing more.
      
      This patch transforms the post method into an unlocked version (and exposes a
      trivial version that just locks around it) and makes other changes necessary to allow
      subclassing.
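
      The split between a locked post and an unlocked one, and the subclass
      that builds on it, might look roughly like the sketch below. It uses the
      standard library instead of OSv's own primitives, and the names
      capped_semaphore, has_waiters and post_capped are assumptions for
      illustration only.

```cpp
#include <algorithm>
#include <condition_variable>
#include <cstddef>
#include <mutex>

class semaphore {
public:
    explicit semaphore(std::size_t initial) : _value(initial) {}
    virtual ~semaphore() = default;

    // Blocks until `units` are available, then takes them.
    void wait(std::size_t units = 1) {
        std::unique_lock<std::mutex> lock(_mtx);
        _waiting += units;
        _cv.wait(lock, [&] { return _value >= units; });
        _value -= units;
        _waiting -= units;
    }

    // The trivial public version: lock, then call the unlocked one.
    void post(std::size_t units = 1) {
        std::lock_guard<std::mutex> lock(_mtx);
        post_unlocked(units);
    }

protected:
    // Must be called with _mtx held; lets subclasses compose larger
    // operations under a single lock acquisition.
    void post_unlocked(std::size_t units) {
        _value += units;
        _cv.notify_all();
    }

    std::mutex _mtx;
    std::condition_variable _cv;
    std::size_t _value;
    std::size_t _waiting = 0;  // total units requested by blocked waiters
};

class capped_semaphore : public semaphore {
public:
    using semaphore::semaphore;

    bool has_waiters() {
        std::lock_guard<std::mutex> lock(_mtx);
        return _waiting > 0;
    }

    // Posts at most as many units as waiters still need; returns that count.
    std::size_t post_capped(std::size_t units) {
        std::lock_guard<std::mutex> lock(_mtx);
        const std::size_t needed = _waiting > _value ? _waiting - _value : 0;
        const std::size_t granted = std::min(units, needed);
        post_unlocked(granted);
        return granted;
    }
};
```

      Because post_capped reads the waiter count and posts under one lock, no
      waiter can arrive or leave between the two steps, which is the reason for
      exposing the unlocked variant to subclasses.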
      
      Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
      Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
      21d9c318
    • mmu: account evacuated size · ab459e83
      Glauber Costa authored
      
      This will be useful when we shrink, so we know how much memory we newly
      released for system consumption.
      
      Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
      Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
      ab459e83
    • mmu: make operate quantifiable. · f1cd4f8d
      Glauber Costa authored
      
      So far, operate works on a page range and at the very most sets a success
      flag somewhere. I am here extending the API to allow it to return how much
      data it manipulated.
      
      So as an example, if we fault in 2MB in an empty range, it will return 2 << 20.
      But if we fault in the same 2MB in a range that already contained some sparse 4k
      pages, we will return (2 << 20) - previous_pages.
      
      That will be useful to count memory usage in certain VMAs.
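
      The accounting in the example above can be sketched with a toy model in
      which the set of already-mapped 4k pages stands in for the page tables.
      The function name populate_range and the use of std::set are assumptions
      for illustration; OSv's operate works on real page tables.

```cpp
#include <cstddef>
#include <set>

constexpr std::size_t page_size = 4096;

// Toy model: `mapped` holds the start addresses of pages already present.
// Populates every 4k page in [start, start + len) that is missing and
// returns the number of bytes that were newly populated.
inline std::size_t populate_range(std::set<std::size_t>& mapped,
                                  std::size_t start, std::size_t len) {
    std::size_t newly_populated = 0;
    for (std::size_t addr = start; addr < start + len; addr += page_size) {
        if (mapped.insert(addr).second) {  // true only if the page was absent
            newly_populated += page_size;
        }
    }
    return newly_populated;
}
```

      Faulting in 2MB over an empty set returns 2 << 20; if three sparse 4k
      pages were already present, the same call returns (2 << 20) minus three
      pages, and repeating it returns 0.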
      
      Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
      Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
      f1cd4f8d
    • string: add fixups for memcpy operations · 9cce0f87
      Glauber Costa authored
      
      When we start using the JVM balloon, our memcpy could fail for valid reasons
      when the JVM is moving memory that is now in an unmapped region. To handle that,
      register a fixup that will trigger a JVM call when the fault happens. If all goes
      well, we will be able to continue normally.
      
      Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
      Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
      9cce0f87
    • pci: Fix offsets in *_pci_config_* · f0aa8143
      Takuya ASADA authored
      On VMware, pci_readw(PCI_CFG_DEVICE_ID) returns the *vendor ID*.
      pci_readw(PCI_CFG_VENDOR_ID) returns the vendor ID as well.
      
      Unlike OSv, the FreeBSD implementation of PCI config space reads and writes
      masks the lower bits of the offset when writing to PCI_CONFIG_ADDRESS, and
      adds the lower bits of the offset to PCI_CONFIG_DATA.
      
      http://fxr.watson.org/fxr/source/amd64/pci/pci_cfgreg.c#L206
      
      This patch changes the access method in OSv to the FreeBSD way.  Tested
      on QEMU/KVM and VMware.
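
      The FreeBSD-style addressing described above can be sketched as a pure
      computation of the value written to the address port and the data port to
      read from. This is a sketch, not the patched OSv code: the struct and
      function names are assumptions, while 0xCF8 and 0xCFC are the
      conventional x86 PCI configuration address and data ports.

```cpp
#include <cstdint>

constexpr std::uint16_t pci_config_address_port = 0xCF8;
constexpr std::uint16_t pci_config_data_port = 0xCFC;

// Hypothetical result type: what to write to the address port, and which
// data port to access afterwards.
struct config_access {
    std::uint32_t address;
    std::uint16_t data_port;
};

inline config_access pci_config_access(std::uint8_t bus, std::uint8_t slot,
                                       std::uint8_t func, std::uint8_t offset) {
    // The address register selects a 32-bit-aligned dword, so the low two
    // bits of the offset are masked out here ...
    const std::uint32_t address = (1u << 31)
        | (static_cast<std::uint32_t>(bus) << 16)
        | (static_cast<std::uint32_t>(slot) << 11)
        | (static_cast<std::uint32_t>(func) << 8)
        | (offset & ~0x03u);
    // ... and added to the data port instead, selecting the bytes in the dword.
    const std::uint16_t data_port =
        static_cast<std::uint16_t>(pci_config_data_port + (offset & 0x03));
    return {address, data_port};
}
```

      The vendor ID (offset 0x00) and the device ID (offset 0x02) share one
      dword, so both produce the same address value and differ only in the data
      port: 0xCFC versus 0xCFE. Reading 16 bits at 0xCFC in both cases would
      return the vendor ID twice, which matches the symptom seen on VMware.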
      
      Reviewed-by: Glauber Costa <glommer@cloudius-systems.com>
      Signed-off-by: Takuya ASADA <syuu@cloudius-systems.com>
      Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
      f0aa8143
    • clock: add monotonic uptime clock · 8dffa912
      Nadav Har'El authored
      
      This patch starts to solve both issue #142 ("Support MONOTONIC_CLOCK")
      and issue #81 (use <chrono> for time).
      
      First, it adds an uptime() function to the "clock" interface, and
      implements it for kvm/xen/hpet by returning the system time from which
      we subtract the system time at boot (but not adding any correction
      for wallclock).
      
      Second, it adds a new std::chrono-based interface to this clock, in
      a new header file <osv/clock.hh>. Instead of the old-style
      clock::get()->uptime(), one should prefer osv::clock::uptime::now().
      This returns a std::chrono::time_point which is type-safe, in the
      sense that: 1. It knows what its epoch is (i.e., that it belongs to
      osv::clock::uptime), and 2. It knows what its units are (nanoseconds).
      This allows the compiler to prevent a user from confusing measurements
      from this clock with those from other clocks, or making mistakes in
      its units.
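
      The type safety described above comes from giving the clock its own
      std::chrono::time_point type. The sketch below is not the contents of
      <osv/clock.hh>: uptime_clock is a hypothetical stand-in built on
      std::chrono::steady_clock, and it treats the first call to now() as
      "boot" because a user-space example has no real boot time to subtract.

```cpp
#include <chrono>

// Hypothetical stand-in for an uptime clock with a std::chrono interface.
struct uptime_clock {
    using duration = std::chrono::nanoseconds;  // the clock's units
    using rep = duration::rep;
    using period = duration::period;
    // Tagging the time_point with this clock ties it to this epoch.
    using time_point = std::chrono::time_point<uptime_clock, duration>;
    static constexpr bool is_steady = true;

    static time_point now() noexcept {
        // Assumption: the first call stands in for the boot timestamp.
        static const auto boot = std::chrono::steady_clock::now();
        const auto since_boot = std::chrono::steady_clock::now() - boot;
        return time_point(std::chrono::duration_cast<duration>(since_boot));
    }
};
```

      Two uptime_clock::time_point values can be subtracted to get a duration
      in nanoseconds, but subtracting one from a
      std::chrono::system_clock::time_point does not compile, because the two
      time_point types name different clocks.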
      
      Third, this patch implements clock_gettime(CLOCK_MONOTONIC), using
      the new osv::clock::uptime::now().
      
      Note that though the new osv::clock::uptime is almost identical to
      std::chrono::steady_clock, they should not be confused. The former is
      actually OSv's implementation of the latter: steady_clock is implemented
      by the C++11 standard library using the POSIX clock_gettime, and that
      is implemented (in this patch) using osv::clock::uptime.
      
      With this patch, we're *not* done with either issue #142 or #81.
      For issue #142, i.e., for supporting CLOCK_MONOTONIC in timerfd, we
      need OSv's timers to work on uptime(), not on clock::get()->time().
      For issue #81, we should add an osv::clock::wall type too (similar to
      what clock::get()->time() does today, but more correctly), and use either
      osv::clock::wall or osv::clock::uptime everywhere that
      clock::get()->time() is currently used in the code.
      clock::get()->time() should be removed.
      
      Reviewed-by: Glauber Costa <glommer@cloudius-systems.com>
      Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
      Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
      8dffa912
    • build: incremental make without image= argument should use the default · 5c68e049
      Tomasz Grabiec authored
      
      Previously, the parameter was read from the generated Makefile, which was
      not regenerated on incremental builds. The fix is to move the default
      to build.mk, so that the default is always picked unless overridden
      by a command-line argument.
      
      Fixes #153
      
      Signed-off-by: Tomasz Grabiec <tgrabiec@cloudius-systems.com>
      Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
      5c68e049
  2. Jan 09, 2014
  3. Jan 08, 2014
  4. Jan 07, 2014