- May 21, 2014
-
-
Claudio Fontana authored
The thread_control_block structure needs to differ between x64 and AArch64. For AArch64's implementation of local-exec TLS, try to match the layout of glibc and the generated code. Do not align the .tdata and .tbss sections with .tdata : ALIGN(64), or it will break the TLS loads. Signed-off-by:
Claudio Fontana <claudio.fontana@huawei.com> Cc: Glauber Costa <glommer@cloudius-systems.com> Cc: Will Newton <will.newton@linaro.org> Signed-off-by:
Avi Kivity <avi@cloudius-systems.com>
-
- May 16, 2014
-
-
Claudio Fontana authored
move driver setup and console creation to arch-setup, and ioapic init for x64 to smp_launch, so that we can remove ifdefs and increase the amount of common code. Signed-off-by:
Claudio Fontana <claudio.fontana@huawei.com> Reviewed-by:
Nadav Har'El <nyh@cloudius-systems.com>
-
Claudio Fontana authored
allow execution to flow until main_cont so we can reach the backtrace. Signed-off-by:
Claudio Fontana <claudio.fontana@huawei.com>
-
Jani Kokkonen authored
implement fixup fault and the backtrace functionality which is its first simple user. Signed-off-by:
Jani Kokkonen <jani.kokkonen@huawei.com> [claudio: added elf changes to allow lookup and demangling to work] Signed-off-by:
Claudio Fontana <claudio.fontana@huawei.com>
-
- May 14, 2014
-
-
Claudio Fontana authored
and do it early (before the loop around init_array) Signed-off-by:
Claudio Fontana <claudio.fontana@huawei.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Tomasz Grabiec authored
This introduces a simple timer-based sampling profiler which reuses our tracing infrastructure to collect samples. To enable the sampler from run.py, run it like this: $ scripts/run.py ... --sampler [frequency] where 'frequency' is an optional parameter overriding the sampling frequency. The default is 1000 (ticks per second). The higher the frequency, the bigger the sampling overhead; values that are too low will hurt profile accuracy. Ad-hoc sampler enabling is planned; the code already takes that into account. To see the profile you need to extract the trace: $ trace extract and then show it like this: $ trace prof All 'prof' options can be applied; for example, you can group by CPU: $ trace prof -g cpu Signed-off-by:
Tomasz Grabiec <tgrabiec@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Claudio Fontana authored
An effect of commit 9bbbe9dc is that no output is possible before the prio 'console' initializers have run. This change makes at least one API available really early (from boot code and premain). Document the requirements on the early console class regarding the write() method. Signed-off-by:
Claudio Fontana <claudio.fontana@huawei.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
- May 09, 2014
-
-
Jaspal Singh Dhillon authored
This patch fixes the case where OSv silently hangs when more than 64 CPUs are provided. Signed-off-by:
Jaspal Singh Dhillon <jaspal.iiith@gmail.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
- May 05, 2014
-
-
Takuya ASADA authored
Signed-off-by:
Takuya ASADA <syuu@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Takuya ASADA authored
Signed-off-by:
Takuya ASADA <syuu@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
- Apr 11, 2014
-
-
Nadav Har'El authored
Currently, in several cases when a bad command line is set in the image, such as an empty command line (as in "make image=empty") or one with invalid parameters (e.g., run.py -e "-a a"), we use abort(). abort() has two annoying "features": it hangs the VM forever, and it shows an ugly stack trace. Both are useful for debugging, but it doesn't make sense to need a debugger when just the command line is misconfigured; we just need to print a message and power off the VM. Calling osv::poweroff() this early during boot is fine after the previous patch, which fixed osv::poweroff(). By the way, running a non-existent file (e.g., 'run.py -e a') already had the correct behavior of powering off, not hanging. Reviewed-by:
Glauber Costa <glommer@cloudius-systems.com> Signed-off-by:
Nadav Har'El <nyh@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
- Apr 02, 2014
-
-
Claudio Fontana authored
Get the command line and the ELF header start, then try to steer clear of apparently random limitations of the model during the early boot stage, and set up vectors as soon as possible to enable some minimal post-mortem info. Signed-off-by:
Claudio Fontana <claudio.fontana@huawei.com>
-
Claudio Fontana authored
Signed-off-by:
Claudio Fontana <claudio.fontana@huawei.com>
-
Nadav Har'El authored
Changes in v3, following Avi's review: * Use WITH_LOCK(migration_lock) instead of migrate_disable()/enable(). * Make the global RCU "generation" counter a static class variable, instead of a static function variable. Rename it "next_generation" (the name "generation" was grossly overloaded previously). * In rcu_synchronize(), use migration_lock to be sure we wake up the thread to which we just added work. * Use thread_handle, instead of thread*, for percpu_quiescent_state_thread. This is safer (an atomic variable, so we can't see it half-set on some esoteric CPU), and cleaner (no need to check t!=0). thread_handle is a bit of an overkill here, but it's not in a performance-sensitive area. The existing rcu_defer() used a global list of deferred work, protected by a global mutex. It also woke up the cleanup thread on every call. These decisions made rcu_dispose() noticeably slower than a regular delete, to the point that when commit 70502950 introduced an rcu_dispose() to every poll() call, we saw the performance of UDP memcached, which calls poll() on every request, drop by as much as 40%. The slowness of rcu_defer() was even more apparent in an artificial benchmark which repeatedly calls new and rcu_dispose from one or several concurrent threads. While on my machine a new/delete pair takes 24 ns, a new/rcu_dispose from a single thread (on a 4-CPU VM) takes a whopping 330 ns, and worse - when we have 4 threads on 4 CPUs in a tight new/rcu_dispose loop, the mutex contention, the fact that we free the memory on the "wrong" CPU, and the excessive context switches all bring the measurement to as much as 12,000 ns. With this patch the new/rcu_dispose numbers are down to 60 ns on a single thread (on 4 CPUs) and 111 ns on 4 concurrent threads (on 4 CPUs). This is a 5.5x - 120x speedup :-) This patch replaces the single list of functions with a per-cpu list. 
rcu_defer() can add more callbacks to this per-cpu list without a mutex, and instead of a single "garbage collection" thread running these callbacks, the per-cpu RCU thread, which we already had, is the one that runs the work deferred on this cpu's list. This per-cpu work is particularly effective for free() work (i.e., rcu_dispose()) because it is faster to free memory on the same CPU where it was allocated. This patch also eliminates the single "garbage collection" thread which the previous code needed. The per-CPU work queue has a fixed size, currently set to 2000 functions. It is actually a double-buffer, so we can continue to accumulate more work while cleaning up; if rcu_defer() is used so quickly that it outpaces the cleanup, rcu_defer() will wait until the buffer is no longer full. The choice of buffer size is a tradeoff between speed and memory: a larger buffer means fewer context switches (between the thread doing rcu_defer() and the RCU thread doing the cleanup), but also more memory temporarily being used by unfreed objects. Unlike the previous code, we do not wake up the cleanup thread after every rcu_defer(). When the RCU cleanup work is frequent but still small relative to the main work of the application (e.g., a memcached server), the RCU cleanup thread would always have a low runtime, which meant we suffered a context switch on almost every wakeup of this thread by rcu_defer(). In this patch, we only wake up the cleanup thread when the buffer becomes full, so we have far fewer context switches. This means that currently rcu_defer() may delay the cleanup an unbounded amount of time. This is normally not a problem, and when it is, namely in rcu_synchronize(), we wake up the thread immediately. Signed-off-by:
Nadav Har'El <nyh@cloudius-systems.com> Signed-off-by:
Avi Kivity <avi@cloudius-systems.com>
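The per-CPU double-buffer described in this commit message can be sketched as a toy model. This is a hedged Python sketch, not OSv's actual C++ implementation; the class and method names (DeferQueue, defer, drain_once) are invented for illustration:

```python
import threading

class DeferQueue:
    """Toy model of the per-CPU deferred-work double-buffer described above.

    Callbacks accumulate in an active buffer; the cleanup thread is woken
    only when the buffer fills. On cleanup, the buffers are swapped so
    producers can keep queueing work while the old buffer is drained.
    """
    def __init__(self, capacity=2000):
        self.capacity = capacity
        self.active = []          # buffer currently accepting callbacks
        self.draining = []        # buffer being run by the cleanup thread
        self.lock = threading.Lock()
        self.not_full = threading.Condition(self.lock)
        self.work_ready = threading.Condition(self.lock)

    def defer(self, fn):
        with self.lock:
            # If producers outpace the cleanup, wait until space frees up.
            while len(self.active) >= self.capacity:
                self.work_ready.notify()      # wake the cleanup thread
                self.not_full.wait()
            self.active.append(fn)
            # Unlike the old global-list design, only wake the cleanup
            # thread when the buffer becomes full.
            if len(self.active) >= self.capacity:
                self.work_ready.notify()

    def drain_once(self):
        """One round of the per-CPU RCU thread's cleanup work."""
        with self.lock:
            self.active, self.draining = self.draining, self.active
            self.not_full.notify_all()
        for fn in self.draining:
            fn()                              # run deferred work (e.g. free)
        self.draining.clear()
```

The key design point the commit describes survives even in miniature: producers pay only an append in the common case, and a context switch happens once per buffer-full rather than once per deferral.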
-
- Apr 01, 2014
-
-
Avi Kivity authored
This reverts commit d24cda2c. It wants migration_lock to be merged first. Signed-off-by:
Avi Kivity <avi@cloudius-systems.com>
-
Nadav Har'El authored
The existing rcu_defer() used a global list of deferred work, protected by a global mutex. It also woke up the cleanup thread on every call. These decisions made rcu_dispose() noticeably slower than a regular delete, to the point that when commit 70502950 introduced an rcu_dispose() to every poll() call, we saw the performance of UDP memcached, which calls poll() on every request, drop by as much as 40%. The slowness of rcu_defer() was even more apparent in an artificial benchmark which repeatedly calls new and rcu_dispose from one or several concurrent threads. While on my machine a new/delete pair takes 24 ns, a new/rcu_dispose from a single thread (on a 4-CPU VM) takes a whopping 330 ns, and worse - when we have 4 threads on 4 CPUs in a tight new/rcu_dispose loop, the mutex contention, the fact that we free the memory on the "wrong" CPU, and the excessive context switches all bring the measurement to as much as 12,000 ns. With this patch the new/rcu_dispose numbers are down to 60 ns on a single thread (on 4 CPUs) and 111 ns on 4 concurrent threads (on 4 CPUs). This is a 5.5x - 120x speedup :-) This patch replaces the single list of functions with a per-cpu list. rcu_defer() can add more callbacks to this per-cpu list without a mutex, and instead of a single "garbage collection" thread running these callbacks, the per-cpu RCU thread, which we already had, is the one that runs the work deferred on this cpu's list. This per-cpu work is particularly effective for free() work (i.e., rcu_dispose()) because it is faster to free memory on the same CPU where it was allocated. This patch also eliminates the single "garbage collection" thread which the previous code needed. The per-CPU work queue has a fixed size, currently set to 2000 functions. It is actually a double-buffer, so we can continue to accumulate more work while cleaning up; if rcu_defer() is used so quickly that it outpaces the cleanup, rcu_defer() will wait until the buffer is no longer full. 
The choice of buffer size is a tradeoff between speed and memory: a larger buffer means fewer context switches (between the thread doing rcu_defer() and the RCU thread doing the cleanup), but also more memory temporarily being used by unfreed objects. Unlike the previous code, we do not wake up the cleanup thread after every rcu_defer(). When the RCU cleanup work is frequent but still small relative to the main work of the application (e.g., a memcached server), the RCU cleanup thread would always have a low runtime, which meant we suffered a context switch on almost every wakeup of this thread by rcu_defer(). In this patch, we only wake up the cleanup thread when the buffer becomes full, so we have far fewer context switches. This means that currently rcu_defer() may delay the cleanup an unbounded amount of time. This is normally not a problem, and when it is, namely in rcu_synchronize(), we wake up the thread immediately. Signed-off-by:
Nadav Har'El <nyh@cloudius-systems.com> Signed-off-by:
Avi Kivity <avi@cloudius-systems.com>
-
- Mar 27, 2014
-
-
Raphael S. Carvalho authored
Previously, the zfs device was provided only to allow the use of the commands needed to create the zpool, and thus the file system. At that time, doing so was quite enough; however, making the zfs device, i.e. /dev/zfs, part of every OSv instance allows us to use commands that help analyze, debug, and tune the zpool and the file systems it contains. The basic explanation is that those commands use libzfs, which in turn relies on /dev/zfs to communicate with the zfs code. Example commands: zpool, zfs, zdb (the latter not yet ported to OSv). This patch will also be helpful for the ongoing ztest porting. Reviewed-by:
Nadav Har'El <nyh@cloudius-systems.com> Signed-off-by:
Raphael S. Carvalho <raphaelsc@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
- Mar 25, 2014
-
-
Asias He authored
On VirtualBox and VMware, the version info is not printed correctly. Fix it by printing only after our console is initialized. Reviewed-by:
Nadav Har'El <nyh@cloudius-systems.com> Signed-off-by:
Asias He <asias@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
- Mar 24, 2014
-
-
Takuya ASADA authored
Signed-off-by:
Takuya ASADA <syuu@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
- Mar 06, 2014
-
-
Asias He authored
This driver is for VMware's pvscsi disk. It has better performance than using the AHCI device in VMware. It uses the common scsi code in scsi-common. The driver is written from scratch; the QEMU and Linux pvscsi drivers were used as a reference, as there is no specification available. Tested on QEMU's pvscsi implementation and VMware Workstation. Signed-off-by:
Asias He <asias@cloudius-systems.com>
-
- Mar 04, 2014
-
-
Nadav Har'El authored
Removed an unused declaration, which was unnecessary and caused a warning in Eclipse. Signed-off-by:
Nadav Har'El <nyh@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
- Feb 12, 2014
-
-
Asias He authored
AHCI is supported by various VMMs, e.g., VirtualBox and VMware Workstation. Adding AHCI support enables OSv to run on them when the para-virtualized block device is not present or not yet supported. Tested on VirtualBox, VMware Workstation and QEMU. Signed-off-by:
Asias He <asias@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
- Feb 11, 2014
-
-
Claudio Fontana authored
move the arch-specific stuff in premain to arch/x64/arch-setup.cc. Introduce arch_init_premain() and arch_setup_tls(). arch_init_premain() is supposed to perform arch-specific initialization before the common premain code is run. arch_setup_tls() is run _after_ the common setup_tls code. Reviewed-by:
Glauber Costa <glommer@cloudius-systems.com> Signed-off-by:
Claudio Fontana <claudio.fontana@huawei.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
- Feb 07, 2014
-
-
Glauber Costa authored
When booting with --bootchart, OSv will print a summary of where our boot time is being spent, up to the point right before the execution of main. This mechanism can later be extended to keep measuring, using other facilities to account for the application, etc. Example output: OSv v0.05-156-gd3918a1 disk read (real mode): 132.94ms, (+132.94ms) .init functions: 146.10ms, (+13.16ms) SMP launched: 147.57ms, (+1.47ms) RCU initialized: 150.61ms, (+3.04ms) VFS initialized: 154.08ms, (+3.46ms) Network initialized: 160.79ms, (+6.71ms) pvpanic done: 162.31ms, (+1.52ms) pci enumerated: 171.45ms, (+9.14ms) drivers probe: 171.46ms, (+0.02ms) drivers loaded: 182.52ms, (+11.06ms) ZFS mounted: 2116.32ms, (+1933.80ms) Total time: 2116.70ms, (+0.38ms) Signed-off-by:
Glauber Costa <glommer@cloudius-systems.com> Reviewed-by:
Nadav Har'El <nyh@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
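The bootchart output in this message is just a list of named checkpoints with absolute times and running deltas. A minimal sketch of such a recorder (a Python model with invented names; OSv's implementation lives in its C++ boot path):

```python
import time

class BootChart:
    """Records named checkpoints and formats each one with its absolute
    time and the delta from the previous checkpoint, mirroring the
    '<name>: <t>ms, (+<delta>ms)' lines in the example output above."""
    def __init__(self, clock=time.monotonic):
        self.clock = clock
        self.start = clock()      # time zero: first thing recorded at boot
        self.events = []          # list of (name, seconds since start)

    def mark(self, name):
        self.events.append((name, self.clock() - self.start))

    def report(self):
        lines = []
        prev = 0.0
        for name, t in self.events:
            lines.append("%s: %.2fms, (+%.2fms)"
                         % (name, t * 1000, (t - prev) * 1000))
            prev = t
        return "\n".join(lines)
```

The delta column is what makes the chart useful: in the example output it immediately shows ZFS mounting (+1933.80ms) dominating total boot time.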
-
Glauber Costa authored
Signed-off-by:
Glauber Costa <glommer@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
- Feb 06, 2014
-
-
Nadav Har'El authored
When running a command in the background, do_main_thread() passes the command line in a std::vector pointer to a new pthread. Unfortunately, soon afterwards the vector can go out of scope and the result is a crash. Fix this oversight. Signed-off-by:
Nadav Har'El <nyh@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Raphael S. Carvalho authored
This patch registers the ARC shrinker by using the event handler list from the BSD side. When the ARC is initialized, it inserts the lowmem event handler into an external event handler list. lowmem basically signals the reclaiming thread, which then wakes up to decide which approach should be used to shrink the ARC. Memory pressure on OSv is activated when the 20% watermark is reached, so the shrink policy decides which shrinker should be called on such events. bsd_shrinker_init is responsible for finding the lowmem event handler in the external list and integrating it into our shrinker infrastructure. arc_lowmem needed a few changes to return the amount of memory released from the ARC. Glauber and I tested the functionality by filling up the ARC to its target, then allocating as much memory as possible to see if the ARC shrinker would kick in to release memory back to the operating system. Signed-off-by:
Glauber Costa <glommer@cloudius-systems.com> Signed-off-by:
Raphael S. Carvalho <raphaelsc@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Takuya ASADA authored
Signed-off-by:
Takuya ASADA <syuu@cloudius-systems.com> Signed-off-by:
Avi Kivity <avi@cloudius-systems.com>
-
- Jan 27, 2014
-
-
Nadav Har'El authored
Remove the unused #include of <drivers/clock.hh>. Except for the clock drivers and <osv/clock.hh>, no source file now includes this header. Rather, <osv/clock.hh> should be used. Code including <sched.hh> will also get <osv/clock.hh> automatically. Reviewed-by:
Glauber Costa <glommer@cloudius-systems.com> Signed-off-by:
Nadav Har'El <nyh@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Nadav Har'El authored
Several source files include <drivers/clockevent.hh>, though this is a very low-level feature which they don't actually use. sched.cc does use <drivers/clockevent.hh>, but already gets it through sched.hh, so it also doesn't need to include it explicitly. This patch removes the unnecessary includes. Reviewed-by:
Glauber Costa <glommer@cloudius-systems.com> Signed-off-by:
Nadav Har'El <nyh@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
- Jan 22, 2014
-
-
Nadav Har'El authored
Our loader's command line (what is given to the "-e" option of run.py) already allows running multiple commands (each a shared object with arguments) separated by a semicolon - e.g., run.py -e "program1.so; program2.so; program3.so" This patch allows, just like in Unix, using a "&" instead of a ";", in which case the preceding program is run in the background - in our case this means in a new thread. For example, run.py -e "httpserver.so& java.so ..." As before, a command line can comprise multiple commands, and whitespace around the separators (; or &) is optional. Take care if you intend to run the *same* object multiple times concurrently, e.g., "something.so& something.so". For an object to support this use case, it should support its main() being called in parallel, and in particular avoid using global variables. Signed-off-by:
Nadav Har'El <nyh@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
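The separator semantics described above (';' runs commands sequentially, '&' runs the preceding command in the background, whitespace around separators optional) can be modeled with a small parser. This is a hypothetical Python sketch of the splitting rules, not OSv's actual parser:

```python
import re

def parse_cmdline(cmdline):
    """Split a loader command line into (argv, background) pairs.

    ';' and '&' both terminate a command; '&' additionally marks the
    command *before* it to run in the background (a new thread in OSv).
    Whitespace around separators is optional.
    """
    commands = []
    # Split on the separators while keeping them, so we know how each
    # command was terminated.
    parts = re.split(r'([;&])', cmdline)
    for chunk, sep in zip(parts[0::2], parts[1::2] + ['']):
        argv = chunk.split()
        if argv:
            commands.append((argv, sep == '&'))
    return commands
```

For the example in the message, "httpserver.so& java.so ..." yields the HTTP server marked as background and the java command as the foreground command.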
-
Pekka Enberg authored
Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Pekka Enberg authored
Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Pekka Enberg authored
Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Pekka Enberg authored
Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Pekka Enberg authored
Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Pekka Enberg authored
Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Pekka Enberg authored
Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Pekka Enberg authored
Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
- Jan 21, 2014
-
-
Nadav Har'El authored
Add cwd (current directory) and env (environment variable) options to the loader. These can be useful for applications that expect to run in a certain directory, or that expect certain environment variables to exist. Example usage: run.py -e "--cwd=/tmp /usr/bin/something.so" Signed-off-by:
Nadav Har'El <nyh@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
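Options like the --cwd=/tmp in the example above have to be peeled off the front of a command's argv before the program itself is invoked. A hedged Python sketch of that splitting step (the function name is invented, and the --env=KEY=VAL syntax is an assumption, not confirmed by the message; only --cwd= appears in the example):

```python
def split_loader_options(args):
    """Separate leading loader options from the program and its arguments.

    Recognizes --cwd=DIR and (assumed syntax) --env=KEY=VAL at the front
    of argv; everything from the first non-option token onward is the
    program to run.
    """
    opts = {"cwd": None, "env": {}}
    rest = list(args)
    while rest:
        arg = rest[0]
        if arg.startswith("--cwd="):
            opts["cwd"] = arg[len("--cwd="):]
        elif arg.startswith("--env="):
            key, _, val = arg[len("--env="):].partition("=")
            opts["env"][key] = val
        else:
            break                 # first non-option: the program itself
        rest.pop(0)
    return opts, rest
```

With the documented example, "--cwd=/tmp /usr/bin/something.so" splits into the cwd option and the shared object to run.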
-