  1. Jul 29, 2013
    • bio: change bio_list to bio_queue · 8593a8c5
      Glauber Costa authored
      bio_queue is the name used by BSD. Since this is only a naming
      difference, it is better to change our code, which has few users of
      this type (only ramdisk), than to patch all the code I am importing
      from BSD that uses it.
  2. Jul 11, 2013
    • virtio: explicitly request contiguous memory for the virtio ring · 79aa5d28
      Avi Kivity authored
      Required by the virtio spec.
    • Move from a request array approach back to allocation. · 5bcb95d9
      Dor Laor authored
      virtio_blk pre-allocates requests into a cache to avoid re-allocation
      (possibly an unneeded optimization with the current allocator). However,
      it doesn't take into account that requests can be completed out of
      order, and simply reuses requests in cyclic order. Noted by Avi. I had
      a version that peeked into the index ring, but it was too complex a
      solution. There is no performance degradation with SMP, thanks to the
      good allocator we have today.
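      For illustration, a minimal C++ sketch of the two schemes; the types
      and names here are assumptions, not OSv's actual code:

        #include <array>
        #include <memory>

        struct blk_request { /* header the host reads until completion */ };

        // Old scheme: a fixed cache reused in cyclic order. If completions
        // arrive out of order, slot (head % N) can be handed out again
        // while the host still owns it.
        template <size_t N>
        struct request_cache {
            std::array<blk_request, N> slots;
            size_t head = 0;
            blk_request* next() { return &slots[head++ % N]; }
        };

        // New scheme: allocate per request; each request lives until its
        // own completion, whatever the completion order.
        std::unique_ptr<blk_request> make_request()
        {
            return std::make_unique<blk_request>();
        }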
    • Fix hang in virtio_driver::wait_for_queue · 8ebb1693
      Nadav Har'El authored
      virtio_driver::wait_for_queue() would often hang in a memcached and
      mc_benchmark workload, waiting forever for received packets although
      these *do* arrive.
      
      As part of the virtio protocol, we need to set the host notification
      flag (we call this, somewhat confusingly, queue->enable_interrupts())
      and then check if there's anything in the queue, and if not, wait
      for the interrupt.
      
      This order is important: if we check the queue and only then set the
      notification flag, and data arrives between the two, the check will
      find nothing and an interrupt will never be sent - and we can wait
      indefinitely for data that has already arrived.
      
      We did this in the right order, but the host code, running on a
      different CPU, might see memory accesses in a different order!
      We need a memory fence to ensure that the same order is also seen
      on other processors.
      
      This patch adds a memory fence to the end of the enable_interrupts()
      function itself, so we can continue to use it as before in
      wait_for_queue(). Note that we do *not* add a memory fence to
      disable_interrupts() - because no current use (and no expected use)
      cares about the ordering of disable_interrupts() vs other memory
      accesses.
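      As a sketch of the fix (field names are assumptions; OSv's real vring
      differs), the fence goes at the end of enable_interrupts(), between
      publishing the flag and any later check of the used ring:

        #include <atomic>
        #include <cstdint>

        enum { VRING_AVAIL_F_NO_INTERRUPT = 1 };  // from the virtio spec

        struct vring_sketch {
            volatile uint16_t* avail_flags;  // assumed pointer into avail ring
            volatile uint16_t* used_idx;     // assumed pointer into used ring
            uint16_t last_used = 0;          // last used entry we consumed

            void enable_interrupts()
            {
                // Ask the host to notify us again...
                *avail_flags &= uint16_t(~VRING_AVAIL_F_NO_INTERRUPT);
                // ...and make sure the host sees that before we read the
                // used ring index below. This is the fence this patch adds.
                std::atomic_thread_fence(std::memory_order_seq_cst);
            }

            bool used_ring_not_empty()
            {
                return *used_idx != last_used;
            }
        };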
    • Revert 4c1dd505 · 8d48ef43
      Nadav Har'El authored
      I'm returning Dor's original virtio_driver::wait_for_queue().

      The rewrite just masked, with its slightly different timing and a
      redundant second check before waiting, the real bug, which was a
      missing memory barrier (see the separate patch fixing that).

      Dor's original code has the good feature that after waking up from a
      sleep - when presumably we already have something in the queue - we
      check the queue before pessimistically enabling host notifications.
      So let's use Dor's original code.
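      The shape of that wait loop, as described above, is roughly the
      following (a sketch with assumed helper names, reusing vring_sketch
      from the previous entry; not the actual OSv code):

        void wait_for_irq();  // assumed blocking primitive: sleep until irq

        // Check the queue first: after a wakeup there is usually already
        // something there, so we rarely need to enable host notifications.
        void wait_for_queue(vring_sketch* queue)
        {
            while (!queue->used_ring_not_empty()) {
                queue->enable_interrupts();      // includes the fence
                if (queue->used_ring_not_empty())
                    break;                       // raced with the host: done
                wait_for_irq();
            }
        }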
  3. Jul 10, 2013
    • rewrite virtio_driver::wait_for_queue · 4c1dd505
      Nadav Har'El authored
      In my memcached tests (with mc_benchmark as the driver), I saw that
      virtio_driver::wait_for_queue appears to have some bug or race
      condition - in some cases it hangs waiting on the rx queue and simply
      never returns.

      I can't say I understand what the bug in this code is, however.
      Instead, I just rewrote it from scratch in a different way, which I
      think is much clearer - and the new code no longer exhibits this bug.

      I can't put my finger on why my new version is more correct than the
      old one - or even just how it differs... Dor, maybe you can find a
      difference? But it definitely behaves differently.
    • Allow parallel execution of {add|get}_buf, prevent fast path allocs · 350fa518
      Dor Laor authored
      virtio-vring and its users (net/blk) were changed so that no request
      header is allocated at run time, except during init. In order to do
      that, I had to change get_buf and break it into multiple parts:
      
        // Get the top item from the used ring
        void* get_buf_elem(u32 *len);
        // Let the host know we consumed the used entry.
        // We separate this from get_buf_elem so that no one
        // recycles the request header location until we're
        // finished with it in the upper layer.
        void get_buf_finalize();
        // GC the used items that were already read, freeing their
        // slots within the ring. Should be called by add_buf; it was
        // separated from the get_buf flow to allow the two to run in
        // parallel.
        void get_buf_gc();
      
      As a result, it was simple to get rid of the shared lock that used to
      protect the _avail_head variable. Today only the thread that calls
      add_buf updates this variable (add_buf calls get_buf_gc internally).

      There are two new locks instead (a usage sketch follows below):
        - the virtio-net tx_gc lock - very rarely contended: it is taken
          either by the tx_gc thread or, normally, by the tx xmit thread
        - the virtio-blk make_requests lock - there are parallel requests
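      A minimal sketch of a consumer using the three-part API above (the
      class and the handler are illustrative assumptions, not OSv's code):

        #include <cstdint>
        using u32 = uint32_t;

        struct vring_api {
            void* get_buf_elem(u32* len);  // peek top of the used ring
            void  get_buf_finalize();      // allow the slot to be recycled
            void  get_buf_gc();            // reclaim consumed slots
                                           // (add_buf calls this)
        };

        void drain(vring_api* queue)
        {
            u32 len;
            while (void* req = queue->get_buf_elem(&len)) {
                // The upper layer may still read the request header here;
                // nothing recycles it until get_buf_finalize() is called.
                // handle_request(req, len);  // hypothetical upper layer
                queue->get_buf_finalize();
            }
        }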
    • Trivial: Move code above, preparation for preventing fast path allocations for... · cc8cc19e
      Dor Laor authored
      Trivial: Move code above, in preparation for preventing fast path
      allocations for the virtio request data.
  4. Jun 30, 2013
    • Implement event index, a virtio optimization · 26e046a3
      Dor Laor authored
      The optimization improves performance by letting each side of the
      ring know what was the last index read by the remote party. In cases
      where the other side still has older indexes waiting for processing,
      we won't trigger another update (kick or irq, depending on the
      direction). The optimization reduces irq injections by 7%.

      Along the way, collapse vring::need_event into the code and use
      std::atomic(s) to access any variable that is shared with the host.
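      The event-index test itself is the standard one from the virtio spec
      (the equivalent of Linux's vring_need_event()); for reference:

        #include <cstdint>

        // Notify the other side only if new_idx has moved past event_idx
        // since old_idx; the uint16_t arithmetic wraps modulo 2^16, which
        // is exactly what the ring indexes do.
        static inline bool need_event(uint16_t event_idx, uint16_t new_idx,
                                      uint16_t old_idx)
        {
            return uint16_t(new_idx - event_idx - 1) <
                   uint16_t(new_idx - old_idx);
        }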
  5. Jun 26, 2013
    • xen: implement paravirtual clock driver for xen · 70583a58
      Glauber Costa authored
      Unlike KVM, we won't use percpu variables, because Xen already lays
      out the shared info structure statically, and it includes the vcpu
      info pointer for each cpu.

      We could in theory use percpu variables to store pointers to the
      current cpu's vcpu info, but I ended up giving up on this route.
      Since our pcpu implementation has the overhead of computing addresses
      anyway, we may as well pay the price and compute it directly from the
      xen shared info.

      One thing that comes with this is that we can compute precise timings
      using xenclock very early. Since we don't have *that* much to do
      early, it is unclear if KVM needs to be improved in this regard
      (probably not), so this is just a slight bonus.
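      A sketch of the lookup this implies (the struct shapes follow the Xen
      ABI, but the names and the mapping helper are assumptions):

        struct vcpu_time_info { /* layout shown in the next entry's sketch */ };

        struct vcpu_info_sketch {
            // ... event channel fields elided ...
            vcpu_time_info time;
        };

        struct shared_info_sketch {
            vcpu_info_sketch vcpu_info[32];  // laid out statically by Xen
            // ... wall clock and arch fields elided ...
        };

        extern shared_info_sketch* xen_shared_info;  // mapped at boot (assumed)

        // No percpu pointer cache: compute the address from the shared
        // info each time, as the message describes.
        static inline vcpu_time_info* my_time_info(unsigned cpu)
        {
            return &xen_shared_info->vcpu_info[cpu].time;
        }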
    • kvmclock: move pvclock definitions to common header · 881345d2
      Glauber Costa authored
      The designer of kvmclock wrote it to be ABI-compatible with Xen's
      pvclock. We can therefore reuse the same structures, so let's do it.
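      For reference, the shared per-vcpu time layout of the kvmclock/Xen
      pvclock ABI, as published in the public headers (shown as a sketch):

        #include <cstdint>

        // One structure serves both hypervisors: the guest reads version
        // (odd while the host is updating), then converts the TSC delta
        // with tsc_to_system_mul and tsc_shift to get nanoseconds.
        struct pvclock_vcpu_time_info {
            uint32_t version;
            uint32_t pad0;
            uint64_t tsc_timestamp;
            uint64_t system_time;
            uint32_t tsc_to_system_mul;
            int8_t   tsc_shift;
            uint8_t  flags;
            uint8_t  pad[2];
        };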
    • kvmclock: disable preemption for less time · 4317c8c4
      Glauber Costa authored
      Because we now pre-compute the wall clock, we only need preemption
      disabled during the system time calculation.