- Feb 13, 2014
-
-
Eduardo Piva authored
The lzloader itself is very simple: it reads all the data that was compressed using the lz command line tool and decompresses it to the address where the kernel should run, i.e. 0x200000. MAX_BUFFER covers the address range from 0x200000 to 0x1800000. _binary_loader_stripped_elf_lz_start is an extern symbol for the compressed kernel image. It is created during the build process by the objcopy command line tool. Signed-off-by:
Eduardo Piva <efpiva@gmail.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
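A minimal sketch of the address arithmetic described in the lzloader commit above, assuming the upper bound of the range is 0x1800000. The constant and function names are hypothetical and are not taken from the loader's source; the sketch only checks sizes and does not perform any decompression.

```cpp
#include <cassert>
#include <cstdint>

// Addresses taken from the commit message; the names are hypothetical.
constexpr std::uint64_t kernel_base  = 0x200000;   // where the kernel should run
constexpr std::uint64_t kernel_limit = 0x1800000;  // end of the output range

// Size of the window the decompressor may write into.
constexpr std::uint64_t max_buffer = kernel_limit - kernel_base;

// Returns true if an image of `size` bytes, decompressed at kernel_base,
// stays inside the window.
constexpr bool fits_in_window(std::uint64_t size)
{
    return size <= max_buffer;
}
```

Under these assumptions the window is 0x1600000 bytes (22 MiB), which bounds the size of the decompressed kernel image.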
-
Eduardo Piva authored
The Fastlz library provides two default tools for compression and decompression: 6pack and 6unpack. We are not using those tools since they work with chunks and checksums, because they assume a restriction on buffer size (the most common usage for compression tools is reading from disk and writing the compressed image back). Since we're working with the whole image in memory and our output buffer is the memory itself, we can uncompress all the data at once, making it a much simpler and faster solution. But to do that, we must compress the image in the same way, which is why this simple command line tool was written. Signed-off-by:
Eduardo Piva <efpiva@gmail.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Eduardo Piva authored
Added source code of fastlz from fastlz.org. The original library is written in C but is also C++ compatible, hence in this patch I added it as C++ source code, without the preprocessor and extern "C" code. Signed-off-by:
Eduardo Piva <efpiva@gmail.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Raphael S. Carvalho authored
_flags is currently being used to determine which page_ops will be assigned to the anonymous VMA; however, the variable flags received as a parameter is the one that must be used for that purpose. This problem was found while searching for the root of the general protection faults that were often happening when running our test suite. Bisect pointed to the commit 55693e5c. The GPF was probably happening due to code that should only proceed with completely initialized anonymous VMAs. Reviewed-by:
Nadav Har'El <nyh@cloudius-systems.com> Reviewed-by:
Gleb Natapov <gleb@cloudius-systems.com> Signed-off-by:
Raphael S. Carvalho <raphaelsc@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Avi Kivity authored
Reviewed-by:
Nadav Har'El <nyh@cloudius-systems.com> Signed-off-by:
Avi Kivity <avi@cloudius-systems.com> Signed-off-by:
Vlad Zolotarov <vladz@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Dmitry Fleytman authored
Current code injects vnc parameters into the XL domain configuration file improperly, thus breaking run.py for Xen. Signed-off-by:
Dmitry Fleytman <dmitry@daynix.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
- Feb 12, 2014
-
-
Pekka Enberg authored
Make it explicit that the use of common sense is required when applying the rules of the style guide. Also point out explicitly that in multiple variable declarations, you're supposed to "violate" an earlier rule. Reviewed-by:
Gleb Natapov <gleb@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Zhi Yong Wu authored
C++ coding convention is to bind '*' and '&' to the type, not to the variable. Signed-off-by:
Zhi Yong Wu <zwu.kernel@gmail.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
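A short illustration of the convention stated in the commit above, with '*' and '&' written next to the type. The function and variable names are made up for the example.

```cpp
#include <cassert>

// '*' and '&' bind to the type, not to the variable name.
inline int read_through(const int* ptr, const int& ref)
{
    return *ptr + ref;
}

inline int demo()
{
    int value = 20;
    int* p = &value;   // preferred: int* p, not int *p
    int& r = value;    // preferred: int& r, not int &r
    return read_through(p, r);
}
```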
-
Zhi Yong Wu authored
Signed-off-by:
Zhi Yong Wu <zwu.kernel@gmail.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Zhi Yong Wu authored
Signed-off-by:
Zhi Yong Wu <zwu.kernel@gmail.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Zhi Yong Wu authored
Signed-off-by:
Zhi Yong Wu <zwu.kernel@gmail.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Raphael S. Carvalho authored
Zero is a regularly used value, so let's instead use a rare one to compare the content of the symbols against. Addressing some stylistic issues as well. The new output follows: $ sudo scripts/run.py -e 'tests/tst-resolve.so' OSv v0.05-348-g8b39f8c Target value: 0x05050505 Success: nonexistant = 0x05050505 Success: debug = 0x05050505 Success: condvar_wait = 0x05050505 The time: 1392076964 success. Reviewed-by:
Nadav Har'El <nyh@cloudius-systems.com> Signed-off-by:
Raphael S. Carvalho <raphaelsc@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Nadav Har'El authored
When multiple threads concurrently use a function which has a static variable with a constructor, gcc guarantees that only one thread will run the constructor, and the other ones wait. It uses __cxa_guard_acquire()/release()/abort() for implementing this guarantee. Unfortunately these functions, implemented in the C++ standard library, use the Linux futex syscall which we don't implement. The futex system call is only used in the rare case of contention - in the normal case, the __cxa_guard_* functions use just atomic memory operations. This patch implements the bare minimum we need from futex to make the __cxa_guard_* functions work: All we need is FUTEX_WAIT and FUTEX_WAKE, with 0 timeout in FUTEX_WAIT and wake all in FUTEX_WAKE. It's nice how the new waitqueues fit this implementation like a glove to a hand: We could use condvars too, but we anyway need to do the wake() with the mutex locked (to get the futex's atomic test-and-wait), and in that case we can use waitqueues without their additional internal mutex. This patch also adds a test for this bug. Catching this bug in a real application is very rare, but the test exposes it by defining a function-static object with a very slow constructor (it sleeps for a second), and calls the function from several threads concurrently. Before this patch the test fails with: constructing syscall(): unimplemented system call 202. Aborting. Fixes #199. Signed-off-by:
Nadav Har'El <nyh@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
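The test idea described in the commit above can be sketched as follows. This is not the test added by the patch; the names are hypothetical and the constructor sleeps for 100 ms rather than a second. Several threads call a function holding a function-static object with a slow constructor, and the constructor should run exactly once while the other threads wait.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>
#include <vector>

std::atomic<int> constructions{0};

struct slow {
    slow()
    {
        // A slow constructor widens the window in which other threads
        // arrive and have to wait for initialization to finish.
        std::this_thread::sleep_for(std::chrono::milliseconds(100));
        ++constructions;
    }
};

void use_static()
{
    // With gcc, this initialization is guarded by the __cxa_guard_* functions.
    static slow instance;
    (void)instance;
}

// Calls use_static() from nthreads threads and returns how many times
// the constructor ran.
int run_concurrently(int nthreads)
{
    std::vector<std::thread> threads;
    for (int i = 0; i < nthreads; i++) {
        threads.emplace_back(use_static);
    }
    for (auto& t : threads) {
        t.join();
    }
    return constructions.load();
}
```

On a system where the guard functions can block, for example through a working futex, the returned count is 1 regardless of the number of threads.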
-
Zhi Yong Wu authored
It can reduce the duplicated code Signed-off-by:
Zhi Yong Wu <zwu.kernel@gmail.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Zhi Yong Wu authored
Signed-off-by:
Zhi Yong Wu <zwu.kernel@gmail.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Zhi Yong Wu authored
Signed-off-by:
Zhi Yong Wu <zwu.kernel@gmail.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Zhi Yong Wu authored
Signed-off-by:
Zhi Yong Wu <zwu.kernel@gmail.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Zhi Yong Wu authored
Signed-off-by:
Zhi Yong Wu <zwu.kernel@gmail.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Zhi Yong Wu authored
Signed-off-by:
Zhi Yong Wu <zwu.kernel@gmail.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Dmitry Fleytman authored
Build failed on my Fedora 20 due to lack of maven package. Reviewed-by: Zhi Yong Wu <zwu.kernel@gmail.com> Signed-off-by:
Dmitry Fleytman <dmitry@daynix.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Zhi Yong Wu authored
When control flow reaches the bottom inner loop in namei(), the pointer p will point to either a '\0' or a '/' character because of the break condition of the upper inner loop: for (i = 0; i < PATH_MAX; i++) { if (*p == '\0' || *p == '/') { break; } name[i] = *p++; } So the "while" loop will never be executed and we can eliminate it as dead code. Reviewed-by:
Raphael S. Carvalho <raphaelsc@cloudius-systems.com> Signed-off-by:
Zhi Yong Wu <zwu.kernel@gmail.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
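The loop quoted in the commit above can be wrapped in a small stand-alone function to check the claim for path components shorter than the bound. `copy_component` and `path_max` are hypothetical names for this sketch, not the ones used in namei().

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t path_max = 4096;  // stand-in for PATH_MAX

// Copies one path component into name (which must have room for
// path_max + 1 bytes) and returns the position where the scan stopped.
const char* copy_component(const char* p, char* name)
{
    std::size_t i;
    for (i = 0; i < path_max; i++) {
        if (*p == '\0' || *p == '/') {
            break;
        }
        name[i] = *p++;
    }
    name[i] = '\0';
    return p;
}
```

For such inputs the returned pointer always refers to a '\0' or a '/', so a later loop that skips characters until one of those is found has nothing left to do.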
-
Asias He authored
Add --sata option to use the AHCI driver instead of virtio-blk for QEMU. It makes no sense to use a SATA device instead of a virtio-blk device, but this is mainly for testing purposes. Signed-off-by:
Asias He <asias@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Asias He authored
AHCI is supported on various VMMs, e.g. VirtualBox and VMware Workstation. Adding AHCI support enables OSv to run on them if the para-virtualized block device is not present or not supported yet. Tested on VirtualBox, VMware Workstation and QEMU. Signed-off-by:
Asias He <asias@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Asias He authored
Currently, only MSI-X is supported in our PCI layer. Devices like the AHCI controller support MSI interrupts only. This paves the way for the AHCI driver. Signed-off-by:
Asias He <asias@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Asias He authored
MSI is now supported in the PCI layer. Enable MSI support in interrupt_manager. Signed-off-by:
Asias He <asias@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Asias He authored
The first BAR not being present does not mean that all BARs are absent. Some implementations of the AHCI controller only have BAR6 present, with BAR1 to BAR5 empty. Keep probing if a BAR is not present. Signed-off-by:
Asias He <asias@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
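A simplified sketch of the probing change described in the commit above. It models the six BAR registers as plain values where 0 means "not present", which is an assumption made for illustration; the real PCI code also has to deal with details such as 64-bit BARs, which are ignored here. All names are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model: six raw BAR register values, 0 meaning "not present".
using bar_regs = std::array<std::uint32_t, 6>;

// Returns the 1-based indexes of the BARs that are present.
std::vector<int> probe_bars(const bar_regs& regs)
{
    std::vector<int> present;
    for (std::size_t i = 0; i < regs.size(); i++) {
        if (regs[i] == 0) {
            continue;   // keep probing; an early 'break' would miss later BARs
        }
        present.push_back(static_cast<int>(i) + 1);
    }
    return present;
}
```

With BAR1 to BAR5 empty and only BAR6 populated, the loop still reports BAR6, whereas stopping at the first empty BAR would report nothing.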
-
Or Cohen authored
Reviewed-by:
Tomasz Grabiec <tgrabiec@gmail.com> Signed-off-by:
Or Cohen <orc@fewbytes.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
- Feb 11, 2014
-
-
Nadav Har'El authored
This patch adds support for epoll()'s edge-triggered mode, EPOLLET. Fixes #188. As explained in issue #188, Boost's asio uses EPOLLET heavily, and we use that library in our management http server, and also in our image creation tool (cpiod.so). By ignoring EPOLLET, like we did until now, the code worked, but unnecessarily wasted CPU when epoll_wait() always returned immediately instead of waiting until a new event. This patch works within the confines of our existing poll mechanisms - where epoll() calls poll(). We do not change this in this patch, and it should be changed in the future (see issue #17). In this patch we add to each struct file a field "poll_wake_count", which as its name suggests counts the number of poll_wake()s done on this file. Additionally, epoll remembers the last value it saw of this counter, so that in poll_scan(), if we see that an fp (polled with EPOLLET) has an unchanged counter from last time, we do not return readiness on this fp regardless of whether or not it has readable data. We have a complication with EPOLLET on sockets. These have an "SB_SEL" optimization, which avoids calling poll_wake() when it thinks the new data is not interesting because the old data was not yet consumed, and also avoids calling poll_wake() if fp->poll() was not previously done. This optimization is counter-productive for EPOLLET (and causes missed wakeups) so we need to work around it in the EPOLLET case. This patch also adds a test for the EPOLLET case in tst-epoll.cc. The test runs on both OSv and Linux, and can confirm that in the tested scenarios, Linux and OSv behave the same, including even one identical false-positive: When epoll_wait() tells us there is data in a pipe, and we don't read it, but then more data comes on the pipe, epoll_wait() will again return a new event, despite this not really being an edge event (the pipe didn't change from empty to not-empty, as it was previously not-empty as well).
Concluding remarks: The primary goal of this implementation is to stop EPOLLET epoll_wait() from returning immediately despite nothing having happened on the file. That was what caused the 100% CPU use before this patch. That being said, the goal of this patch is NOT to avoid all false-positives or unnecessary wakeups; when events do occur on the file, we may be doing a few more wakeups than strictly necessary. I think this is acceptable (our epoll() has worse problems) but for posterity, I want to explain: I already mentioned above one false-positive that also happens on Linux. Another false-positive wakeup that remains is in one of EPOLLET's classic use cases: Consider several threads sleeping on epoll() on the same socket (e.g., TCP listening socket, or UDP socket). When one packet arrives, normal level-triggered epoll() will wake all the threads, but only one will read the packet and the rest will find they have nothing to read. With edge-triggered epoll, only one thread should be woken and the rest would not. But in our implementation, poll_wake() wakes up *all* the pollers on this file, so we cannot currently support this optimization. Signed-off-by:
Nadav Har'El <nyh@cloudius-systems.com> Signed-off-by:
Avi Kivity <avi@cloudius-systems.com>
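The counter scheme described in the EPOLLET commit above can be modelled in a few lines. This is a sketch under simplifying assumptions, not the OSv implementation: `file_model`, `epoll_entry` and `scan` are hypothetical stand-ins for struct file, epoll's per-file bookkeeping and poll_scan(), and the SB_SEL workaround for sockets is left out.

```cpp
#include <cassert>

// Hypothetical stand-in for struct file.
struct file_model {
    unsigned poll_wake_count = 0;  // bumped on every poll_wake()
    bool readable = false;
};

// Hypothetical stand-in for what epoll remembers about one polled file.
struct epoll_entry {
    file_model* fp;
    bool edge_triggered;
    unsigned last_seen_count;
};

void poll_wake(file_model& f)
{
    f.poll_wake_count++;
}

// Returns true if the entry should be reported as ready by this scan.
bool scan(epoll_entry& e)
{
    if (e.edge_triggered) {
        if (e.fp->poll_wake_count == e.last_seen_count) {
            return false;   // nothing happened since the last scan
        }
        e.last_seen_count = e.fp->poll_wake_count;
    }
    return e.fp->readable;
}
```

In this model an edge-triggered entry is reported once per wakeup even if the data is never read, which reproduces the false positive mentioned in the commit message, while a level-triggered entry is reported on every scan as long as data is readable.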
-
Vlad Zolotarov authored
Instead of binding all msix interrupts to cpu 0, have them chase the interrupt service routine thread and pin themselves to the same cpu. This patch is based on the patch from Avi Kivity <avi@cloudius-systems.com> and used some ideas of Nadav Har'El <nyh@cloudius-systems.com>. It improves the performance of the single thread Rx netperf test by 16%: before - 25694 Mbps after - 29875 Mbps New in V2: - Dropped the functor class - use lambda instead. - Fixed the race in a waking flow. - Added some comments. - Added the performance numbers to the patch description. Signed-off-by:
Vlad Zolotarov <vladz@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Nadav Har'El authored
It appears that in GDB, (mmu::vma*)0 does not work, and one needs to enclose the type's name in single quotes: ('mmu::vma'*)0. This broke the vma_list function in scripts/loader.py, and caused an exception in "osv mmap" and other commands using the vma_list function. This patch adds the missing single-quotes. I don't understand how this code ever worked for anybody... I'm using gdb-7.6.1 from Fedora 19, if it matters. Signed-off-by:
Nadav Har'El <nyh@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Claudio Fontana authored
Move the arch-specific stuff in premain to arch/x64/arch-setup.cc. Introduce arch_init_premain() and arch_setup_tls(). arch_init_premain() is supposed to perform arch-specific initialization before the common premain code is run. arch_setup_tls() is run _after_ the common setup_tls code. Reviewed-by:
Glauber Costa <glommer@cloudius-systems.com> Signed-off-by:
Claudio Fontana <claudio.fontana@huawei.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Avi Kivity authored
Path 1: poll() take file lock file::poll_install take socket lock Path 2: sowakep() (holding socket lock) so_wake_poll() take file lock Fix by running poll_install() outside the file lock (which isn't really needed). Reviewed-by:
Nadav Har'El <nyh@cloudius-systems.com> Signed-off-by:
Avi Kivity <avi@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
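The lock-order fix described in the commit above can be illustrated with two standard mutexes standing in for the file lock and the socket lock. This is a sketch of the general pattern only; the function names and counters are hypothetical and do not mirror the OSv code.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

std::mutex file_lock;     // stand-in for the file lock
std::mutex socket_lock;   // stand-in for the socket lock
int installs = 0;         // protected by socket_lock
int wakes = 0;            // protected by socket_lock

void poll_install()
{
    std::lock_guard<std::mutex> guard(socket_lock);
    installs++;
}

// Path 1 after the fix: the file lock is released before poll_install()
// takes the socket lock, so the two locks are never held together here.
void poll_path()
{
    {
        std::lock_guard<std::mutex> guard(file_lock);
        // ... work that needs the file lock ...
    }
    poll_install();
}

// Path 2: socket lock first, then file lock.
void wake_path()
{
    std::lock_guard<std::mutex> socket_guard(socket_lock);
    std::lock_guard<std::mutex> file_guard(file_lock);
    wakes++;
}

// Runs both paths concurrently for the given number of iterations.
void run_both(int iterations)
{
    std::thread a([=] { for (int i = 0; i < iterations; i++) poll_path(); });
    std::thread b([=] { for (int i = 0; i < iterations; i++) wake_path(); });
    a.join();
    b.join();
}
```

Had poll_path() called poll_install() while still holding the file lock, the two paths would acquire the locks in opposite orders and could deadlock.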
-
Raphael S. Carvalho authored
Found the problem while running tst-resolve.so; the output follows: Success: nonexistant = 0 Failed: debug = -443987883 Failed: condvar_wait = -443987883 The time: 1392070630 2 failures. Bisect pointed to the commit 1dc81fe5. After understanding the actual purpose of the changes introduced by this commit, I figured out that program::lookup simply lacks a return when the target symbol is found in the underlying module. Reviewed-by:
Nadav Har'El <nyh@cloudius-systems.com> Signed-off-by:
Raphael S. Carvalho <raphaelsc@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
- Feb 10, 2014
-
-
Dmitry Fleytman authored
There are two Xen block device backend implementations. The first is a host kernel driver (xen_blkback in Linux) and the other is an anti-driver implemented in qemu-dm, used by Xen HVM guests. The xl toolset used by run.py selects the implementation to use based on the following rules (simplified to avoid non-relevant details):
1. block device specified as storage - use host kernel driver
2. file specified as storage - use QEMU anti-driver
While Linux xen_blkback is highly optimized and supports all newly introduced features, the QEMU implementation is rather simple and outdated. This patch forces xl to use the xen_blkback driver by mapping the OSv image file as a loop (block) device. Write speed and latency measurement results (image on RAM disk):
+++ image file approach (before this patch) +++
misc-bdev-write.so:
109.434 Mb/s 107.822 Mb/s 106.684 Mb/s 102.080 Mb/s 111.211 Mb/s 117.465 Mb/s 107.311 Mb/s 115.834 Mb/s
Wrote 1099.867 MB in 10.03 s = 109.689 Mb/s
misc-bdev-wlatency.so:
Min    50%    90%    99%    99.99% 99.999% Max [msec]
---    ---    ---    ---    ------ ------- ---
0.1000 0.1121 0.1079 0.1245 0.1309 0.2199  0.4412
+++ loop device approach (with this patch) +++
misc-bdev-write.so:
OSv v0.05-193-g021dad4
444.600 Mb/s 579.262 Mb/s 547.984 Mb/s 615.998 Mb/s 519.428 Mb/s 535.732 Mb/s 471.388 Mb/s
Wrote 5535.938 MB in 10.40 s = 532.126 Mb/s
misc-bdev-wlatency.so:
Min    50%    90%    99%    99.99% 99.999% Max [msec]
---    ---    ---    ---    ------ ------- ---
0.0304 0.0362 0.0341 0.0394 0.0487 0.0781  0.1331
Signed-off-by:
Dmitry Fleytman <dmitry@daynix.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Dmitry Fleytman authored
Useful for testing on RAM disks when writes are fast enough to fill the whole image in less than test execution time. Signed-off-by:
Dmitry Fleytman <dmitry@daynix.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Avi Kivity authored
The scenario: run elf file -> demand fault -> allocation -> leak detector tracking -> backtrace_safe() -> page fault leads to a nested exception. Add support for it by allocating an extra stack. Signed-off-by:
Avi Kivity <avi@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Avi Kivity authored
The scenario: run elf image -> demand page fault -> allocation -> leak detector tracking -> backtrace() -> access dwarf tables leads to a nested demand page fault, which we don't (and probably can't) support. Switch to backtrace_safe(), which is of lower quality, but is safer. Signed-off-by:
Avi Kivity <avi@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Tomasz Grabiec authored
In such a case the current context was not initialized. The fix is to default to the master context. Reported-by:
Amnon Heiman <amnon@cloudius-systems.com> Signed-off-by:
Tomasz Grabiec <tgrabiec@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Avi Kivity authored
Ignore a dirty work tree, but not a wrong HEAD; this makes it impossible to update submodules. Signed-off-by:
Avi Kivity <avi@cloudius-systems.com>
-
Avi Kivity authored
Signed-off-by:
Avi Kivity <avi@cloudius-systems.com>
-