- Dec 19, 2013
Asias He authored
Signed-off-by: Asias He <asias@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
-
- Dec 18, 2013
Asias He authored
This adds initial virtio-scsi support. OSv has no SCSI layer, so in this implementation virtio-scsi works directly with the bio layer: it translates BIO_READ, BIO_WRITE and BIO_FLUSH requests into SCSI commands.
Signed-off-by: Asias He <asias@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
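
For illustration, a minimal, hypothetical sketch of the kind of translation described above. The bio_op and scsi_cdb types and make_cdb() are invented for this example; only the READ(16), WRITE(16) and SYNCHRONIZE CACHE(10) opcodes come from the SCSI command set. This is not the driver's actual code.

    #include <cstdint>

    enum class bio_op { read, write, flush };   // stand-ins for BIO_READ/BIO_WRITE/BIO_FLUSH

    struct scsi_cdb {
        uint8_t bytes[16];
    };

    // Build a CDB for one bio; lba/blocks describe the transfer in 512-byte sectors.
    scsi_cdb make_cdb(bio_op op, uint64_t lba, uint32_t blocks)
    {
        scsi_cdb cdb{};
        switch (op) {
        case bio_op::read:
            cdb.bytes[0] = 0x88;                 // READ(16)
            break;
        case bio_op::write:
            cdb.bytes[0] = 0x8a;                 // WRITE(16)
            break;
        case bio_op::flush:
            cdb.bytes[0] = 0x35;                 // SYNCHRONIZE CACHE(10)
            return cdb;                          // flush the whole device: no LBA/length
        }
        for (int i = 0; i < 8; i++)              // 64-bit LBA, big-endian, CDB bytes 2..9
            cdb.bytes[2 + i] = uint8_t(lba >> (8 * (7 - i)));
        for (int i = 0; i < 4; i++)              // 32-bit transfer length, CDB bytes 10..13
            cdb.bytes[10 + i] = uint8_t(blocks >> (8 * (3 - i)));
        return cdb;
    }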
-
Asias He authored
Signed-off-by: Asias He <asias@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
-
Asias He authored
Signed-off-by: Asias He <asias@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
-
Asias He authored
The lock used to protect _waiting_request_thread can go away.
Signed-off-by: Asias He <asias@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
-
- Dec 17, 2013
Asias He authored
Signed-off-by: Asias He <asias@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
-
Asias He authored
Signed-off-by: Asias He <asias@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
-
Asias He authored
We can skip constructing a vring::sg_node.
Signed-off-by: Asias He <asias@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
-
- Dec 16, 2013
Avi Kivity authored
bsd defines some m_ macros, for example m_flags, to save some typing. However, if you have a variable of the same name in another header, for example m_flags, have fun trying to compile your code. Expand the code in place and eliminate the macros.
Signed-off-by: Avi Kivity <avi@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
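
A reduced, deliberately non-compiling example of the collision described above; the macro body mimics the historical BSD shorthand form and is not copied from the OSv tree.

    // This intentionally does NOT compile; it shows the failure mode.
    #define m_flags m_hdr.mh_flags      // historical BSD-style shorthand macro

    class packet {
    public:
        int m_flags = 0;                // expands to "int m_hdr.mh_flags = 0;"
                                        // -> syntax error at compile time
    };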
-
Pekka Enberg authored
Clean up virtio-blk.cc by using the 'auto' type specifier where possible.
Reviewed-by: Dor Laor <dor@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
-
Vlad authored
Switched the virtio-net driver to use if_transmit() instead of the legacy if_start(). This saves us at least two additional lock/unlock sequences per mbuf, since IF_ENQUEUE() and IF_DEQUEUE() take a lock when pushing/removing the mbuf from the queue if the ifnet is in legacy mode.
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
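
A generic sketch of why the direct-transmit path saves lock traffic. It models the legacy enqueue/start scheme with a std::mutex-protected queue; it is an analogy, not the BSD ifnet code.

    #include <mutex>
    #include <queue>

    struct mbuf_stub {};                     // stand-in for struct mbuf

    struct legacy_if {                       // legacy if_start()-style path
        std::mutex qlock;
        std::queue<mbuf_stub*> sendq;

        void enqueue(mbuf_stub* m) {         // ~IF_ENQUEUE(): lock/unlock #1
            std::lock_guard<std::mutex> g(qlock);
            sendq.push(m);
        }
        mbuf_stub* dequeue() {               // ~IF_DEQUEUE(): lock/unlock #2
            std::lock_guard<std::mutex> g(qlock);
            if (sendq.empty())
                return nullptr;
            mbuf_stub* m = sendq.front();
            sendq.pop();
            return m;
        }
    };

    struct direct_if {                       // if_transmit()-style path
        int transmit(mbuf_stub* m) {
            // Hand the mbuf straight to the device ring; the driver serializes
            // on its own tx lock, so the two per-mbuf queue lock round-trips
            // above disappear.
            (void)m;
            return 0;
        }
    };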
-
- Dec 11, 2013
Amnon Heiman authored
Separate /dev/random from the virtio-rng driver and register virtio-rng as a HW RNG entropy source.
Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
-
Asias He authored
This reduces unnecessary interrupts that the host could send to the guest while the guest is in the middle of handling an interrupt. In virtio_driver::wait_for_queue, we re-enable interrupts when there is nothing left to process.
Signed-off-by: Asias He <asias@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
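
A hedged sketch of the general pattern: keep notifications disabled while draining, re-arm them only when the ring looks empty, and re-check afterwards to close the race. The vq_stub interface is invented for illustration and is not OSv's actual wait_for_queue().

    struct vq_stub {                      // stub interface, invented for this sketch
        bool has_used();                  // completed buffers waiting?
        void enable_interrupts();         // allow host notifications again
        void disable_interrupts();        // suppress host notifications
        void wait();                      // sleep until the irq handler wakes us
    };

    void wait_for_used(vq_stub& vq)
    {
        for (;;) {
            if (vq.has_used())
                return;                   // work is available; interrupts stay off
            vq.enable_interrupts();       // about to sleep: re-arm notifications
            if (vq.has_used()) {          // re-check to close the race with the host
                vq.disable_interrupts();
                return;
            }
            vq.wait();                    // woken by the interrupt handler
            vq.disable_interrupts();      // go back to draining with interrupts off
        }
    }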
-
- Dec 09, 2013
Asias He authored
Signed-off-by: Asias He <asias@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
-
Asias He authored
When I hacked use_indirect() to always use indirect buffers, I saw this assertion when running:

    $ scripts/run.py -e "/tests/tst-bdev-write.so vblk1"
    VFS: mounting devfs at /dev
    51.671 Mb/s
    Assertion failed: _status.load() == status::running (/home/asias/src/cloudius-systems/osv/core/sched.cc: prepare_wait: 655)
    Aborted

It turned out that we were putting a waiting thread back into the waiting state: get_buf_gc() calls free(), which might make the thread enter the waiting state again.
Suggested-by: Dor Laor <dor@cloudius-systems.com>
Signed-off-by: Asias He <asias@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
-
Asias He authored
It is useful to test if we can do gc on the used ring.
Signed-off-by: Asias He <asias@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
-
Asias He authored
When _avail_count is less than 1/3 of the ring size, we start using indirect descriptors.
Signed-off-by: Asias He <asias@cloudius-systems.com>
Signed-off-by: Dor Laor <dor@cloudius-systems.com>
Signed-off-by: Avi Kivity <avi@cloudius-systems.com>
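
A minimal sketch of that heuristic, reusing the member names mentioned in the message; it is illustrative rather than the actual vring code, and the desc_needed > 1 condition is an assumption.

    struct ring_state {                      // illustrative only, not the vring class
        unsigned _num;                       // total descriptors in the ring
        unsigned _avail_count;               // descriptors currently free
        bool _indirect_supported;            // VIRTIO_RING_F_INDIRECT_DESC negotiated

        bool use_indirect(unsigned desc_needed) const {
            return _indirect_supported &&
                   desc_needed > 1 &&        // a chain worth collapsing (assumption)
                   _avail_count < _num / 3;  // less than 1/3 of the ring left
        }
    };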
-
Asias He authored
There is no reason we should do this scaling.
Signed-off-by: Asias He <asias@cloudius-systems.com>
Signed-off-by: Dor Laor <dor@cloudius-systems.com>
Signed-off-by: Avi Kivity <avi@cloudius-systems.com>
-
- Dec 08, 2013
Glauber Costa authored
Xen's shared info structure contains hardcoded space for only 32 CPUs. Because we use those structures to derive timing information, we would basically be accessing random memory beyond that point. This is very hard to test and trigger, so what I did to demonstrate the problem (although that wasn't really needed; the math alone shows it) was to print the first timing information each CPU would produce. I could verify that the time on CPUs above 32 was behind the time produced on CPUs below 32.

It is possible to move the vcpu area to a different location, but this is a relatively new feature of the Xen hypervisor and Amazon won't support it, so we need a disable path anyway. I will open an issue for somebody to implement that support eventually.

Another user of the vcpu structure is interrupts, but for interrupts the story is easier, since we can select which CPUs take interrupts and only take them on the first 32 CPUs. In any case, we currently take them all on CPU0, so that is already under control.
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
-
- Dec 06, 2013
Asias He authored
This patch fixes:

- The order of _used_ring_host_head and _used->_idx; the latter is more advanced than the former.
- The unwanted promotion to "int".

Pekka wrote: However, on the right-hand side, the expression type in master evaluates to "int" because of that innocent-looking constant "2" and the lack of parentheses after the cast. That will also force the left-hand side to promote to "int". And no, I really don't claim to follow the integer promotion rules, so I used typeid().name() to verify what the compiler is doing:

    [penberg@localhost tmp]$ cat types.cpp
    #include <typeinfo>
    #include <stdint.h>
    #include <cstdio>

    using namespace std;

    int main()
    {
        unsigned int _num = 1;
        printf("int = %s\n", typeid(int).name());
        printf("uint16_t = %s\n", typeid(uint16_t).name());
        printf("(uint16_t)_num/2 = %s\n", typeid((uint16_t)_num/2).name());
        printf("(uint16_t)(_num/2) = %s\n", typeid((uint16_t)(_num/2)).name());
    }

    [penberg@localhost tmp]$ g++ -std=c++11 -Wall types.cpp
    [penberg@localhost tmp]$ ./a.out
    int = i
    uint16_t = t
    (uint16_t)_num/2 = i
    (uint16_t)(_num/2) = t

Signed-off-by: Asias He <asias@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
-
Asias He authored
Now that the tx gc thread is gone, the gc code can only be called from one place. We do not need the lock anymore.
Signed-off-by: Asias He <asias@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
-
Asias He authored
This unifies the code a bit: we do all the tx queue gc in one common code path.
Signed-off-by: Asias He <asias@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
-
Asias He authored
We do tx queue gc on the tx path if there is not enough space, so the tx queue gc thread is not a must. Dropping it saves us a running thread and a thread wakeup on every interrupt.
Signed-off-by: Asias He <asias@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
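
A hedged sketch of reclaiming descriptors inline on the transmit path; all names here are invented for illustration, not taken from the driver.

    struct txq_stub {                        // stub interface, invented for this sketch
        bool has_room(unsigned descs);       // enough free descriptors for this packet?
        void gc();                           // reclaim buffers the host has consumed
        bool add_buf(void* pkt, unsigned descs);
        void kick();                         // notify the host
    };

    bool xmit(txq_stub& txq, void* pkt, unsigned descs)
    {
        if (!txq.has_room(descs)) {
            txq.gc();                        // inline gc replaces the dedicated gc thread
            if (!txq.has_room(descs))
                return false;                // still full: let the caller requeue or drop
        }
        txq.add_buf(pkt, descs);
        txq.kick();
        return true;
    }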
-
Tomasz Grabiec authored
This is used by QEMU to determine whether the guest will be issuing explicit flush requests. If it is not enabled, QEMU flushes using fdatasync() after every write request, which causes dramatic performance degradation on a spinning disk. The test "tests/tst-bdev-write.so" exposes the issue:

=== Before ===
    0.469 Mb/s
    0.312 Mb/s
    0.323 Mb/s
    0.354 Mb/s
    0.163 Mb/s
    0.100 Mb/s
    0.388 Mb/s
    0.293 Mb/s
    0.401 Mb/s
    Written 3.117 MB in 10.02 s

=== After ===
    49.151 Mb/s
    53.126 Mb/s
    32.079 Mb/s
    49.082 Mb/s
    29.575 Mb/s
    42.553 Mb/s
    35.909 Mb/s
    37.592 Mb/s
    67.425 Mb/s
    Written 440.562 MB in 10.00 s

Using "tests/tst-fs-stress.so":

=== Before ===
    2.414 Mb/s
    3.633 Mb/s
    0.630 Mb/s
    0.279 Mb/s
    2.497 Mb/s
    Written 15.379 MB in 10.51 s
    Latency of write() [s]:
    0      0.000000090
    0.5    0.000004532
    0.9    0.000005969
    0.99   0.000022659
    0.999  0.001138458
    1.0    4.020670891

=== After ===
    11.893 Mb/s
    20.292 Mb/s
    13.801 Mb/s
    16.102 Mb/s
    24.811 Mb/s
    18.113 Mb/s
    21.336 Mb/s
    18.976 Mb/s
    Written 182.254 MB in 10.00 s
    Latency of write() [s]:
    0      0.000000089
    0.5    0.000004497
    0.9    0.000005878
    0.99   0.000018114
    0.999  0.000111873
    1.0    0.681828260

Signed-off-by: Tomasz Grabiec <tgrabiec@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
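
The message does not name the capability bit involved; assuming it is the legacy virtio-blk flush feature (VIRTIO_BLK_F_FLUSH, bit 9), a generic negotiation sketch might look like the following. It illustrates the mechanism only and is not the actual driver change.

    #include <cstdint>

    enum { VIRTIO_BLK_F_FLUSH = 9 };   // legacy virtio-blk flush feature bit (assumed relevant here)

    struct virtio_pci_stub {                    // stub interface for this sketch
        uint32_t read_host_features();          // features offered by the device
        void write_guest_features(uint32_t f);  // features the driver accepts
    };

    void negotiate(virtio_pci_stub& dev)
    {
        uint32_t host = dev.read_host_features();
        uint32_t wanted = 1u << VIRTIO_BLK_F_FLUSH;     // advertise that we will send
                                                        // explicit flush requests
        dev.write_guest_features(host & wanted);
    }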
-
- Dec 05, 2013
Asias He authored
Switch the existing printout-based debug info to tracepoints and add new tracepoints.
Signed-off-by: Asias He <asias@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
-
- Dec 04, 2013
Avi Kivity authored
There is no need to hold the lock while waiting for the host to refill the entropy buffer; drop it.
Signed-off-by: Avi Kivity <avi@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
-
Avi Kivity authored
std::valarray does not guarantee its elements will be allocated contiguously, so the form &v[0] is only guaranteed to point to the first element, not the rest. Switch to std::vector, where contiguity is guaranteed.
Signed-off-by: Avi Kivity <avi@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
-
Avi Kivity authored
Suppose N threads try to acquire a byte of entropy from an empty pool. They will all serialize on the mutex, waiting for the pool to refill. However, when the pool is eventually refilled, only one consumer will be awakened; the rest will continue sleeping even though there is entropy available in the pool. They will eventually be awakened when the worker refills the pool, but that's unneeded latency. Fix by using wake_all() to wake all consumers.
Signed-off-by: Avi Kivity <avi@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
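
The same single-waker pitfall can be reproduced with standard C++ primitives. This self-contained sketch uses std::condition_variable as an analogy for OSv's condvar; notify_all() plays the role of wake_all().

    #include <algorithm>
    #include <condition_variable>
    #include <cstddef>
    #include <mutex>

    struct entropy_pool {
        std::mutex mtx;
        std::condition_variable cv;
        std::size_t available = 0;

        std::size_t get(std::size_t want) {
            std::unique_lock<std::mutex> lk(mtx);
            cv.wait(lk, [&] { return available > 0; });
            std::size_t n = std::min(want, available);
            available -= n;
            return n;
        }

        void refill(std::size_t n) {
            {
                std::lock_guard<std::mutex> g(mtx);
                available += n;
            }
            cv.notify_all();   // wake every waiting consumer, not just one
        }
    };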
-
Pekka Enberg authored
It's a bad idea to claim to support /dev/urandom but rely on the HW RNG, because starting up Cassandra, for example, takes ages. Drop it until we have a cryptographically secure PRNG in OSv that can be used to implement /dev/urandom properly.
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
Signed-off-by: Avi Kivity <avi@cloudius-systems.com>
-
Avi Kivity authored
Signed-off-by: Avi Kivity <avi@cloudius-systems.com>
-
- Dec 03, 2013
Avi Kivity authored
Signed-off-by: Avi Kivity <avi@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
-
- Dec 01, 2013
Pekka Enberg authored
This adds the virtio-rng driver to OSv. The implementation is simple:

- Start a thread that keeps 64 bytes of entropy cached in an internal buffer. Entropy is gathered from the host with virtio-rng.
- Create device nodes for "/dev/random" and "/dev/urandom" that both use the same virtio_rng_read() hook.
- Use the entropy buffer for virtio_rng_read(). If we exhaust the buffer, wake up the thread and wait for more entropy to appear.

We eventually should move device node creation to a separate drivers/random.c that multiplexes between different hardware RNG implementations. However, as we only support virtio-rng, I'm leaving that to whomever implements support for the next RNG.
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
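
A hedged sketch of the buffered read path described in the bullets above, modelling the refill-thread signalling with a std::condition_variable; the class, names and primitives are stand-ins, not the driver's actual code.

    #include <algorithm>
    #include <condition_variable>
    #include <cstddef>
    #include <cstring>
    #include <mutex>
    #include <vector>

    class rng_cache {
    public:
        std::size_t read(void* out, std::size_t len) {
            std::unique_lock<std::mutex> lk(_mtx);
            if (_buf.empty())
                _refill_cv.notify_one();   // buffer exhausted: wake the refill
                                           // thread (not shown here)
            _data_cv.wait(lk, [&] { return !_buf.empty(); });
            std::size_t n = std::min(len, _buf.size());
            std::memcpy(out, _buf.data(), n);
            _buf.erase(_buf.begin(), _buf.begin() + n);
            return n;
        }
    private:
        std::mutex _mtx;
        std::condition_variable _data_cv;    // signalled by the refill thread
        std::condition_variable _refill_cv;  // waited on by the refill thread
        std::vector<unsigned char> _buf;     // up to 64 cached bytes of entropy
    };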
-
Pekka Enberg authored
Not all virtio devices support MSI. Fix device initialization by not writing to the VIRTIO_MSI_QUEUE_VECTOR register if a PCI device does not advertise MSI-X support. This is needed to initialize virtio-rng devices when running on KVM/QEMU.
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
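
A small sketch of that guard. VIRTIO_MSI_QUEUE_VECTOR is the register named in the message (offset 22 in the legacy virtio PCI layout); the surrounding helper names are invented for illustration.

    struct virtio_dev_stub {                            // stub interface for this sketch
        bool is_msix();                                 // does the device expose MSI-X?
        void conf_writew(int offset, unsigned short v); // 16-bit config-space write
    };

    enum { VIRTIO_MSI_QUEUE_VECTOR_OFFSET = 22 };       // legacy virtio PCI layout

    void set_queue_vector(virtio_dev_stub& dev, unsigned short vec)
    {
        if (!dev.is_msix())
            return;                 // no MSI-X capability: the register is not present
        dev.conf_writew(VIRTIO_MSI_QUEUE_VECTOR_OFFSET, vec);
    }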
-
Nadav Har'El authored
When the KVM paravirtual clock isn't available (e.g., on Xen or on plain Qemu), we use the HPET clock. Our HPET clock driver rolled back the clock (clock::get()->time()) once every 42 seconds, causing strange things like a scheduler assertion when the clock jumps back.

The problem is that we read just 32 bits out of the 64 bits of the HPET counter. This means that we roll back the clock once every 2^32 ticks, and with a 10ns tick (which seems to be the case in Qemu), this means about 42 seconds. Douglas Adams would have liked this bug ;-)

Fixed the code, and removed the overly-optimistic comment which stated the rollback should take years. Added an assertion that the HPET really has a 64-bit counter; Intel's HPET specification from 2004 already recommends a 64-bit counter, and both Qemu and Xen do implement one. If we had to deal with a 32-bit counter, we would need to write a handler for the interrupt that the HPET sends every time the counter wraps around.
Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
Signed-off-by: Avi Kivity <avi@cloudius-systems.com>
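
For the arithmetic: 2^32 ticks at 10 ns each is about 42.9 seconds, which is where the 42-second rollback came from. Below is a hedged sketch of the fix, reading the full 64-bit main counter (register offset 0xF0 in the HPET specification) through a volatile pointer; mapping the MMIO region is assumed to happen elsewhere, and this is not the driver's actual code.

    #include <cstdint>

    static volatile uint8_t* hpet_base;   // the mapped HPET MMIO region (assumed
                                          // to be set up elsewhere)

    inline uint64_t hpet_counter()
    {
        // Buggy form: reading only the low 32 bits wraps every
        // 2^32 * 10ns ~= 42.9 seconds:
        //   return *reinterpret_cast<volatile uint32_t*>(hpet_base + 0xf0);
        // Fixed form: read the full 64-bit main counter (the driver asserts
        // that the capabilities register reports a 64-bit counter):
        return *reinterpret_cast<volatile uint64_t*>(hpet_base + 0xf0);
    }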
-
- Nov 28, 2013
Pekka Enberg authored
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
-
Pekka Enberg authored
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
-
Pekka Enberg authored
Add a new virtio::probe() helper function to simplify virtio driver probing.
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
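
A hedged sketch of what such a probe helper might check. The 0x1af4 vendor ID is the standard virtio PCI vendor ID, but the signature and fields below are assumptions, not OSv's actual virtio::probe().

    #include <cstdint>

    struct pci_device_stub {           // stand-in for the PCI device abstraction
        uint16_t vendor_id;
        uint16_t virtio_device_type;   // e.g. net, block, rng
    };

    enum { VIRTIO_VENDOR_ID = 0x1af4 };

    bool virtio_probe(const pci_device_stub* dev, uint16_t expected_type)
    {
        return dev &&
               dev->vendor_id == VIRTIO_VENDOR_ID &&
               dev->virtio_device_type == expected_type;
    }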
-
- Nov 26, 2013
Nadav Har'El authored
This patch causes incorrect usage of percpu<>/PERCPU() to cause compilation errors instead of silent runtime corruptions. Thanks to Dmitry for first noticing this issue in xen_intr.cc (see his separate patch), and to Avi for suggesting a compile-time fix.

With this patch:

1. Using percpu<...> to *define* a per-cpu variable fails compilation. Instead, PERCPU(...) must be used for the definition, which is important because it places the variable in the ".percpu" section.
2. If a *declaration* is needed additionally (e.g., for a static class member), percpu<...> must be used, not PERCPU(). Trying to use PERCPU() for a declaration will cause a compilation error.
3. PERCPU() only works on statically-constructed objects - global variables, static function-variables and static class-members. Trying to use it on a dynamically-constructed object - stack variable, class field, or operator new - will cause a compilation error.

With this patch, the bug in xen_intr.cc would have been caught at compile time.
Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
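
A much-simplified sketch of how a definition/declaration split and a (partial) static-construction restriction can be expressed; it illustrates the idea only and is not OSv's percpu implementation.

    #include <cstddef>

    template <typename T>
    class percpu {
    public:
        T& operator*() { return _val; }                 // simplified accessor
        // Rule 3 (partially): forbid heap-allocated instances.
        void* operator new(std::size_t) = delete;
        void* operator new[](std::size_t) = delete;
    private:
        T _val{};
    };

    // Rule 1: definitions go through PERCPU() so the object lands in the
    // ".percpu" section. (The real implementation also makes a bare
    // percpu<...> definition fail to compile; that enforcement is omitted here.)
    #define PERCPU(type, var) \
        __attribute__((section(".percpu"))) percpu<type> var

    // Rule 2: declarations, e.g. a static class member, still use percpu<...>.
    struct stats {
        static percpu<long> counter;   // declaration
    };
    PERCPU(long, stats::counter);      // definition, placed in .percpu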
-
- Nov 21, 2013
Nadav Har'El authored
prio.hh defines various initialization priorities. The actual numbers don't matter, just the order between them. But when we add too many priorities between existing ones, we may need to renumber. This is plain ugly, and reminds me of Basic programming ;-) So this patch switches to an enum (an enum class, actually). We now just have a list of priority names in order, with no numbers.

It would have been straightforward, if it weren't for a bug in GCC (see http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59211 ) where the "init_priority" attribute doesn't accept the enum (while the "constructor" attribute does). Luckily, a simple workaround - explicitly casting to int - works.
Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
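
A compact sketch of the approach, with the explicit int cast as the workaround for GCC bug 59211; the priority names are invented and the real prio.hh will differ.

    enum class init_prio : int {
        console = 101,   // init_priority values must be at least 101
        clock,           // new names can be inserted without renumbering the rest
        scheduler,
    };

    struct probe {
        probe() {}       // user-provided constructor, so these need dynamic init
    };

    // GCC bug 59211: init_priority() rejects the enum value directly, but an
    // explicit cast to int is accepted. Construction order follows the
    // priorities, not the order of the definitions in the file:
    probe late_obj  __attribute__((init_priority((int)init_prio::scheduler)));
    probe early_obj __attribute__((init_priority((int)init_prio::console)));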
-
- Nov 06, 2013
Pekka Enberg authored
condvar_wait() expects an absolute time, not a duration.
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
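
The distinction, illustrated with standard C++ timed waits as an analogy for condvar_wait(): an absolute deadline is "now plus the timeout", not the timeout itself.

    #include <chrono>
    #include <condition_variable>
    #include <mutex>

    std::mutex mtx;
    std::condition_variable cv;

    void wait_with_deadline(std::chrono::milliseconds timeout)
    {
        std::unique_lock<std::mutex> lk(mtx);
        auto deadline = std::chrono::steady_clock::now() + timeout;  // absolute time point
        cv.wait_until(lk, deadline);   // correct: wait_until() takes an absolute time
        // cv.wait_for(lk, timeout);   // the duration-based variant, for comparison.
        // Passing a bare duration where an absolute time is expected makes the
        // wait expire (almost) immediately, since it is interpreted as a point
        // close to the clock's epoch.
    }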
-