- Dec 30, 2013
-
-
Tomasz Grabiec authored
This was the cause of poor ZFS performance in the misc-fs-stress test.

Before:

  Wrote 168.129 MB in 10.12 s = 16.610 Mb/s
  Wrote 194.688 MB in 10.00 s = 19.469 Mb/s
  Wrote 183.004 MB in 10.06 s = 18.186 Mb/s
  Wrote 167.754 MB in 10.28 s = 16.315 Mb/s

After:

  Wrote 636.227 MB in 10.00 s = 63.623 Mb/s
  Wrote 666.979 MB in 10.00 s = 66.696 Mb/s
  Wrote 613.512 MB in 10.00 s = 61.350 Mb/s
  Wrote 573.502 MB in 10.00 s = 57.346 Mb/s
  Wrote 668.607 MB in 10.00 s = 66.857 Mb/s
  Wrote 630.920 MB in 10.00 s = 63.087 Mb/s

It turned out that the limiting factor was the ARC cache. A check inside arc_tempreserve_space() was forcing the txg to be synced too often (once every 400ms). The arc_c variable was only 16M (arc_c_min), which allowed only 8M to be written per transaction. It turns out that arc_c depends on kmem_size(), which is based on physmem, which was never initialized.

I would hold off on committing this for several reasons, which I would like you to consider.

While this improves write throughput, it makes the boot time after make much longer; on my disk the boot time increases from 1.5s to 10s. This is because ZFS verifies the last 3 txgs upon mount. This patch increases the txg size, which results in more data to check on the next boot. I'm working on solving this right now.

Something worth noting is that while larger transactions sync less often, increasing throughput, they also take longer to sync, increasing worst-case latency. In my test the pauses get as high as 3 seconds with 1G of guest memory.

Signed-off-by:
Tomasz Grabiec <tgrabiec@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
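
As a rough sanity check of the numbers in this message (a sketch, not ZFS or OSv code): if each transaction group can absorb at most 8 MiB and a sync is forced every 400 ms, sustained write throughput is capped near 20 MiB/s, which is in line with the 16-19 MB/s measured before the fix. The helper name is hypothetical.

```cpp
// Upper bound on write throughput when every transaction group (txg) is
// limited to a fixed amount of dirty data and synced at a fixed period.
// Hypothetical helper for illustration; not part of ZFS or OSv.
constexpr double max_throughput_mib_per_s(double mib_per_txg, double sync_period_s)
{
    return mib_per_txg / sync_period_s;
}

// 8 MiB per txg, one sync every 0.4 s: roughly 20 MiB/s at best.
static_assert(max_throughput_mib_per_s(8.0, 0.4) > 19.9 &&
              max_throughput_mib_per_s(8.0, 0.4) < 20.1,
              "ceiling should be about 20 MiB/s");
```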
-
Avi Kivity authored
The current initialization is to something like 5-bit words, which truncates everything to control characters on VMware. QEMU somehow ignores the word size. Fix by initializing the serial port to 8N1, yielding useful output. Signed-off-by:
Avi Kivity <avi@cloudius-systems.com>
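
For readers unfamiliar with the encoding, here is a minimal sketch of how a 16550-compatible UART's Line Control Register value is composed. It assumes the standard 16550 bit layout; the helper and enum names are hypothetical and this is not the OSv driver code. With 5-bit words only the low five bits of each byte are transmitted, so every character lands in the range 0x00-0x1f.

```cpp
#include <cstdint>

// Line Control Register (LCR) encoding on a 16550-compatible UART:
// bits 1:0 select the word length, bit 2 the stop bits, bit 3 enables parity.
enum class parity { none, enabled };

constexpr std::uint8_t lcr_value(unsigned word_bits, parity p, unsigned stop_bits)
{
    // word_bits 5..8 map to 0b00..0b11 in bits 1:0.
    return static_cast<std::uint8_t>(((word_bits - 5) & 0x3)
                                     | (stop_bits > 1 ? 0x04 : 0x00)
                                     | (p == parity::enabled ? 0x08 : 0x00));
}

static_assert(lcr_value(8, parity::none, 1) == 0x03, "8N1");
static_assert(lcr_value(5, parity::none, 1) == 0x00, "5N1");
```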
-
Gleb Natapov authored
Signed-off-by:
Gleb Natapov <gleb@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Gleb Natapov authored
mprotect(PROT_WRITE) on a file opened read-only should fail, but the current mprotect() implementation is missing the check. This patch implements it. Signed-off-by:
Gleb Natapov <gleb@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
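
A minimal sketch of the kind of check described, under stated assumptions: the struct and function are hypothetical and do not mirror OSv's internal classes, and the restriction is assumed to apply only to shared mappings (private copy-on-write mappings are assumed to remain writable, as with mmap()).

```cpp
#include <cerrno>
#include <sys/mman.h>

// Hypothetical description of a file-backed mapping; the field names are
// illustrative only.
struct file_mapping {
    bool shared;         // created with MAP_SHARED
    bool file_writable;  // underlying file was opened for writing
};

// Returns 0 if the new protection is allowed, or an errno value otherwise.
inline int check_mprotect(const file_mapping& m, int prot)
{
    if ((prot & PROT_WRITE) && m.shared && !m.file_writable) {
        return EACCES;  // write access requested on a read-only file
    }
    return 0;
}
```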
-
Gleb Natapov authored
If the mmap size is greater than a large page, it is better to align the mapping to the large page size, since this ensures optimal coverage of the region by large pages. Signed-off-by:
Gleb Natapov <gleb@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
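
The alignment rule can be sketched as follows. This is an illustration only: the 2 MiB large page size is the usual x86-64 value, and the function names are hypothetical rather than OSv's. A 4 MiB mapping that starts at a misaligned address contains only one naturally aligned 2 MiB page, while the same mapping started on a 2 MiB boundary contains two.

```cpp
#include <cstddef>
#include <cstdint>

constexpr std::size_t small_page_size = 4096;             // 4 KiB
constexpr std::size_t large_page_size = 2 * 1024 * 1024;  // 2 MiB on x86-64

// Round addr up to the next multiple of align (align must be a power of two).
constexpr std::uintptr_t align_up(std::uintptr_t addr, std::size_t align)
{
    return (addr + align - 1) & ~(static_cast<std::uintptr_t>(align) - 1);
}

// Pick a start address for a new mapping: large-page aligned when the mapping
// is bigger than one large page, small-page aligned otherwise.
constexpr std::uintptr_t choose_start(std::uintptr_t hint, std::size_t size)
{
    return align_up(hint, size > large_page_size ? large_page_size : small_page_size);
}
```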
-
Or Cohen authored
getgrgid_r(3) is needed when querying file attributes from Java (see java.nio.file.Files.readAttributes()). This is needed for the long format (-l) flag of ls. getgrgid_r also requires sysconf(_SC_GETGR_R_SIZE_MAX). Signed-off-by:
Or Cohen <orc@fewbytes.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
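
For context, a typical caller pairs the two interfaces roughly like this. It is a sketch of common POSIX usage, not code from the patch; the wrapper name and the 1024-byte fallback are assumptions.

```cpp
#include <cerrno>
#include <cstddef>
#include <grp.h>
#include <string>
#include <unistd.h>
#include <vector>

// Look up a group name by gid using the reentrant getgrgid_r().
// Returns true and fills 'name' on success; false if the gid has no entry
// or the lookup fails.
inline bool group_name(gid_t gid, std::string& name)
{
    long hint = sysconf(_SC_GETGR_R_SIZE_MAX);
    // sysconf() may return -1 when there is no fixed limit; 1024 is an
    // arbitrary fallback chosen for this sketch.
    std::vector<char> buf(hint > 0 ? static_cast<std::size_t>(hint) : 1024);

    struct group grp;
    struct group* result = nullptr;
    int err;
    // ERANGE means the buffer was too small: grow it and retry.
    while ((err = getgrgid_r(gid, &grp, buf.data(), buf.size(), &result)) == ERANGE) {
        buf.resize(buf.size() * 2);
    }
    if (err != 0 || result == nullptr) {
        return false;
    }
    name = result->gr_name;
    return true;
}
```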
-
Pekka Enberg authored
Fail build immediately if compiler does not support x86-64. Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Nadav Har'El authored
OSv can only be built on a 64-bit Linux distribution - otherwise the compiler will not (by default, at least) generate 64-bit code, and we won't be able to run the 64-bit guest needed for building the ZFS filesystem. Make this requirement explicit in README.md. Signed-off-by:
Nadav Har'El <nyh@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Avi Kivity authored
While (unfortunately) C++ doesn't support designated initializers, and the compiler rejects them, one instance has survived in xenbus. Strangely, gcc 4.8.2 generates correct code, while gcc 4.8.0 fails with an internal compiler error, instead of both of them rejecting the code. Signed-off-by:
Avi Kivity <avi@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Avi Kivity authored
Without this, only even megabytes are accessible, and accesses to odd megabytes hit the even megabytes. For example address 0x312345 is aliased to address 0x212345. Enable the A20 gate to prevent this. Fixes boot on VMware. Signed-off-by:
Avi Kivity <avi@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
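
The aliasing in the example can be reproduced with a one-line mask. This is only an illustration of the address arithmetic (the function name is hypothetical), not the code that enables the gate.

```cpp
#include <cstdint>

// With the A20 gate disabled, address line 20 is forced to zero, so the
// physical address actually reached has bit 20 cleared.
constexpr std::uint32_t a20_disabled_alias(std::uint32_t addr)
{
    return addr & ~(std::uint32_t{1} << 20);
}

static_assert(a20_disabled_alias(0x312345) == 0x212345, "odd megabyte aliases down");
static_assert(a20_disabled_alias(0x212345) == 0x212345, "even megabyte is unchanged");
```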
-
- Dec 27, 2013
-
-
Pekka Enberg authored
Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Tomasz Grabiec authored
Now that total is a long, casting it to float may cause precision loss. Signed-off-by:
Tomasz Grabiec <tgrabiec@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
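
To illustrate the concern (a standalone sketch with hypothetical helper names, not the test's code): a 32-bit float has a 24-bit significand, so not every integer above 2^24 survives the conversion, whereas a double represents integers exactly up to 2^53.

```cpp
#include <cstdint>

// True if v converts to float and back without changing.
inline bool survives_float(std::int64_t v)
{
    return static_cast<std::int64_t>(static_cast<float>(v)) == v;
}

// True if v converts to double and back without changing.
inline bool survives_double(std::int64_t v)
{
    return static_cast<std::int64_t>(static_cast<double>(v)) == v;
}
```

For example, 16777216 (2^24) survives the round trip through float, but 16777217 does not; both survive a round trip through double.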
-
Pekka Enberg authored
As pointed out by Tomek, the call to printStackTrace() will already print out the exception message. Drop the "uncaught exception" messages to simplify things. Acked-by:
Tomasz Grabiec <tgrabiec@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Asias He authored
This is useful to notify us that something has gone wrong. Reviewed-by:
Tomasz Grabiec <tgrabiec@gmail.com> Signed-off-by:
Asias He <asias@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Asias He authored
bytes_written can overflow while testing on a fast device. Reviewed-by:
Tomasz Grabiec <tgrabiec@gmail.com> Signed-off-by:
Asias He <asias@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
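
The message does not say which type bytes_written had. Assuming, purely for illustration, a 32-bit signed counter and a 10-second run, the sketch below shows why a fast device overflows it while a 64-bit counter does not. The helper name and the rates are hypothetical.

```cpp
#include <cstdint>
#include <limits>

// Does a byte count of bytes_per_second * seconds fit in counter type T?
template <typename T>
constexpr bool fits(std::uint64_t bytes_per_second, std::uint64_t seconds)
{
    return bytes_per_second * seconds
           <= static_cast<std::uint64_t>(std::numeric_limits<T>::max());
}

// 500 MB/s for 10 s is 5e9 bytes: too large for int32_t, fine for int64_t.
static_assert(!fits<std::int32_t>(500000000ULL, 10), "32-bit counter overflows");
static_assert(fits<std::int64_t>(500000000ULL, 10), "64-bit counter is enough");
```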
-
Pekka Enberg authored
Start using tprintf as in drivers/pci.cc, for example, and switch to 'error' severity to disable early boot ACPI messages. Eventually, core/debug.cc needs to be more dynamically configured but for now it's good enough to use a standardized API and make the output silent by default. Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Amnon Heiman authored
The uncaught exception handler did not state what the exception is. Signed-off-by:
Amnon Heiman <amnon@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Tomasz Grabiec authored
Signed-off-by:
Tomasz Grabiec <tgrabiec@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Vlad Zolotarov authored
tst-commands.so and tst-fsx.so were missing. Signed-off-by:
Vlad Zolotarov <vladz@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Vlad Zolotarov authored
Signed-off-by:
Vlad Zolotarov <vladz@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Vlad Zolotarov authored
Add additional error patterns to test.py:

1) For tests that are missing. Without it, test.py was reporting the missing tests as PASSED.
2) For tests that fail with critical errors like missing symbols. Without it, "make check" was hanging silently, leaving you dazed and confused.
3) For tests that return a non-zero status:
   a) Those that use exit().
   b) Those that use return xx.

Signed-off-by:
Vlad Zolotarov <vladz@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Raphael S. Carvalho authored
Signed-off-by:
Vlad Zolotarov <vladz@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Vlad Zolotarov authored
Signed-off-by:
Vlad Zolotarov <vladz@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Avi Kivity authored
Indicate to the virtual hardware that our stack supports UDP fragmentation offload. This improves performance by a factor of about 3.3 (from ~20Gbps to ~66Gbps) running the netperf UDP STREAM test. Signed-off-by:
Avi Kivity <avi@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Avi Kivity authored
We reported the size of the last buffer in the packet, rather than the size of the complete packet. Fix to report the total size. Signed-off-by:
Avi Kivity <avi@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
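
The difference between the two quantities can be sketched as below. The types are hypothetical and do not reflect the driver's actual structures; the point is only that a packet spanning several buffers has a length equal to the sum over the chain.

```cpp
#include <cstddef>
#include <vector>

// One buffer of a packet that spans several buffers.
// Hypothetical type for illustration.
struct rx_buffer {
    std::size_t len;
};

// The packet length is the sum over all buffers in the chain,
// not the length of the last buffer alone.
inline std::size_t packet_length(const std::vector<rx_buffer>& chain)
{
    std::size_t total = 0;
    for (const auto& b : chain) {
        total += b.len;
    }
    return total;
}
```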
-
Asias He authored
With indirect descriptors, we can queue more buffers in the queue. Indirect descriptors help block devices: a large request no longer consumes the entire ring, and the effective queue depth becomes deeper. Indirect descriptors do not help net devices, because they make the queue longer and so add latency. The tests show that indirect descriptors make blk faster and that there is no real measurable degradation on net. Also, indirect descriptors are used only when we are short of descriptors.

This patch enables indirect descriptors only for vblk and vscsi; vnet is not enabled.

1) vblk
   Before: 340MB/s
   After: 350MB/s
2) vscsi
   Before: 320MB/s
   After: 410MB/s

Signed-off-by:
Asias He <asias@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
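
The queue-depth argument can be made concrete with a little arithmetic. In this sketch the ring size (128) and the number of segments per request (32) are assumed values, and the function names are hypothetical.

```cpp
#include <cstddef>

// Ring slots consumed by one request made of 'segments' buffers.
// Direct: one ring descriptor per segment.
// Indirect: a single ring descriptor pointing at a separate descriptor table.
constexpr std::size_t ring_slots(std::size_t segments, bool indirect)
{
    return indirect ? 1 : segments;
}

// How many such requests fit in a ring of 'ring_size' descriptors.
constexpr std::size_t queue_depth(std::size_t ring_size, std::size_t segments, bool indirect)
{
    return ring_size / ring_slots(segments, indirect);
}

static_assert(queue_depth(128, 32, false) == 4, "direct descriptors");
static_assert(queue_depth(128, 32, true) == 128, "indirect descriptors");
```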
-
- Dec 26, 2013
-
-
Asias He authored
VIRTIO_RING_F_INDIRECT_DESC belongs to the base feature bits. There is no need to specify it in the net driver. Signed-off-by:
Asias He <asias@cloudius-systems.com> Reviewed-by:
Dor Laor <dor@cloudius-systems.com> Signed-off-by:
Avi Kivity <avi@cloudius-systems.com>
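
One way to picture the layering is sketched below. The bit numbers are those defined by the virtio specification; the function names and the split into base and driver feature sets are hypothetical and may not match OSv's classes.

```cpp
#include <cstdint>

// Feature bit numbers from the virtio specification.
constexpr unsigned VIRTIO_RING_F_INDIRECT_DESC = 28;
constexpr unsigned VIRTIO_NET_F_MAC = 5;

constexpr std::uint32_t bit(unsigned n) { return std::uint32_t{1} << n; }

// Hypothetical layering: transport-level features are declared once in a base
// set, and each device driver only adds its device-specific bits.
constexpr std::uint32_t base_features() { return bit(VIRTIO_RING_F_INDIRECT_DESC); }
constexpr std::uint32_t net_features() { return base_features() | bit(VIRTIO_NET_F_MAC); }

// The driver may only use features that the host also offers.
constexpr std::uint32_t negotiate(std::uint32_t host, std::uint32_t driver)
{
    return host & driver;
}
```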
-
Gleb Natapov authored
operate_range() has it already. Signed-off-by:
Gleb Natapov <gleb@cloudius-systems.com> Signed-off-by:
Avi Kivity <avi@cloudius-systems.com>
-
Gleb Natapov authored
If permissions are elevated, another CPU will fault and will see the new permissions after the page walk. Signed-off-by:
Gleb Natapov <gleb@cloudius-systems.com> Signed-off-by:
Avi Kivity <avi@cloudius-systems.com>
-
Gleb Natapov authored
A page cannot be freed before remote TLBs are flushed: if a remote CPU has the page in its TLB and the page is reallocated for some other purpose, the remote CPU can still access the page through its TLB and corrupt its contents. Think about two threads running on two different CPUs: the first one writes to a virtual address constantly and the second unmaps the virtual address. The physical page that the virtual address is mapped to cannot be freed before the TLBs of both CPUs are flushed. Signed-off-by:
Gleb Natapov <gleb@cloudius-systems.com> Signed-off-by:
Avi Kivity <avi@cloudius-systems.com>
-
Gleb Natapov authored
Add constexpr to make sure they are evaluated at compile time if possible. The compiler will probably do it anyway, though. Signed-off-by:
Gleb Natapov <gleb@cloudius-systems.com> Signed-off-by:
Avi Kivity <avi@cloudius-systems.com>
-
Gleb Natapov authored
Implement a page_mapper variant for the virt_to_phys mapping. The map_level class now holds a reference to page_mapper, since page_mapper state needs to be preserved across function calls. Signed-off-by:
Gleb Natapov <gleb@cloudius-systems.com> Signed-off-by:
Avi Kivity <avi@cloudius-systems.com>
-
Gleb Natapov authored
This is pretty straightforward: provide a page_mapper variant for each of those operations and remove the unused pt walker. Signed-off-by:
Gleb Natapov <gleb@cloudius-systems.com> Signed-off-by:
Avi Kivity <avi@cloudius-systems.com>
-
Gleb Natapov authored
This patch implements a generic page table walker that traverses page table levels at compile time. It accepts a page_mapper class, which controls various aspects of page traversal, as a parameter. page_table_operation controls whether non-present intermediate pages should be allocated, how to handle leaf small/huge pages, whether to split huge pages, how to handle a sub-area of a huge page when splitting is disabled, and whether the walker should loop over multiple page entries. linear_map_level() is modified to use the new page walker. Signed-off-by:
Gleb Natapov <gleb@cloudius-systems.com> Signed-off-by:
Avi Kivity <avi@cloudius-systems.com>
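
The idea of traversing levels at compile time can be illustrated with template recursion. This is a toy sketch assuming x86-64 4-level paging with 9 index bits per level; the type names are hypothetical and far simpler than the walker in the patch, which also handles allocation, huge pages and ranges.

```cpp
#include <cstdint>
#include <vector>

// Each level indexes 9 bits of the virtual address, above a 12-bit page offset.
constexpr unsigned pt_index(std::uint64_t virt, unsigned level)
{
    return static_cast<unsigned>((virt >> (12 + 9 * level)) & 0x1ff);
}

// Hypothetical mapper policy: records the index visited at each level.
struct recording_mapper {
    std::vector<unsigned> visited;
    void visit(unsigned /*level*/, unsigned index) { visited.push_back(index); }
};

// The level is a template parameter, so the recursion is resolved by the
// compiler instead of looping over levels at run time.
template <unsigned Level>
struct walker {
    template <typename Mapper>
    static void walk(Mapper& m, std::uint64_t virt)
    {
        m.visit(Level, pt_index(virt, Level));
        walker<Level - 1>::walk(m, virt);
    }
};

// Level 0 is the leaf: stop the recursion.
template <>
struct walker<0> {
    template <typename Mapper>
    static void walk(Mapper& m, std::uint64_t virt)
    {
        m.visit(0, pt_index(virt, 0));
    }
};
```

Calling `walker<3>::walk(mapper, addr)` visits the top level first and the leaf level last, with the policy object deciding what happens at each step.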
-
Gleb Natapov authored
Move code that will be needed by unified page walker. No changes to generated code. Signed-off-by:
Gleb Natapov <gleb@cloudius-systems.com> Signed-off-by:
Avi Kivity <avi@cloudius-systems.com>
-
Avi Kivity authored
Backported from FreeBSD r242252. Improves netperf by about 10%. Signed-off-by:
Avi Kivity <avi@cloudius-systems.com>
-
Avi Kivity authored
This is more useful if there is no ordering between the two numbers (either one can be ahead). Change BYTES_THIS_ACK to return unsigned, to prevent an unsigned division from turning into a signed division. Signed-off-by:
Avi Kivity <avi@cloudius-systems.com>
-
Avi Kivity authored
Add comparison operators that use modulo arithmetic to order sequence numbers, and use them to replace SEQ_LT() and friends, increasing code readability. As a consequence, std::min() and std::max() can be used instead of SEQ_MIN() and SEQ_MAX(). Signed-off-by:
Avi Kivity <avi@cloudius-systems.com>
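
A minimal sketch of such a type, assuming the usual BSD convention that a sequence number a precedes b when the 32-bit difference a - b is negative as a signed value. The class below is hypothetical and much smaller than whatever OSv defines.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical minimal TCP sequence-number type.
class tcp_seq {
public:
    constexpr explicit tcp_seq(std::uint32_t raw) : _raw(raw) {}
    constexpr std::uint32_t raw() const { return _raw; }

    // Signed distance from b to a, computed modulo 2^32. The narrowing
    // conversion relies on two's complement wrap-around, as SEQ_LT() does.
    friend constexpr std::int32_t operator-(tcp_seq a, tcp_seq b)
    {
        return static_cast<std::int32_t>(a._raw - b._raw);
    }
    friend constexpr bool operator<(tcp_seq a, tcp_seq b) { return (a - b) < 0; }
    friend constexpr bool operator==(tcp_seq a, tcp_seq b) { return a._raw == b._raw; }

private:
    std::uint32_t _raw;
};
```

Because operator< is defined, `std::min(a, b)` works on sequence numbers directly and stays correct when one of them has wrapped past 2^32.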
-
Avi Kivity authored
TCP sequence numbers are similar to integers, but have different comparison operations. Separate them into a class so we don't mix the two. Signed-off-by:
Avi Kivity <avi@cloudius-systems.com>
-
Avi Kivity authored
Inline functions can be overloaded and are less nasty than macros in other ways (like evaluating their arguments only once). Note that we can't touch ntohl() itself, since it is defined to be an out-of-line function by libc. Signed-off-by:
Avi Kivity <avi@cloudius-systems.com>
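
The "evaluating their arguments only once" point is easy to demonstrate. The macro and functions below are hypothetical examples, unrelated to the actual macros converted by the patch.

```cpp
// A function-like macro pastes its argument text wherever it appears, so an
// argument with side effects may be evaluated more than once.
#define SQUARE_MACRO(x) ((x) * (x))

// An inline function evaluates each argument exactly once and can be overloaded.
inline int square(int x) { return x * x; }
inline long square(long x) { return x * x; }

// Helper with a visible side effect: counts how often it is called.
inline int next_value(int& calls)
{
    ++calls;
    return 3;
}
```

With these definitions, `SQUARE_MACRO(next_value(calls))` calls the helper twice, while `square(next_value(calls))` calls it once.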
-