- Sep 02, 2013
-
-
Pekka Enberg authored
Add simple tests for munmap() on file-backed memory maps. This exposes a limitation: munmap() does not write out MAP_SHARED mappings.
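A minimal sketch of such a test, assuming a scratch file path passed in by the caller (the actual tst-mmap.cc test differs in detail):

    #include <sys/mman.h>
    #include <sys/types.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <cassert>
    #include <cstring>

    static void test_munmap_writes_back_shared(const char* path)
    {
        int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0666);
        assert(fd >= 0);
        assert(ftruncate(fd, 4096) == 0);
        char* p = static_cast<char*>(
            mmap(nullptr, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0));
        assert(p != MAP_FAILED);
        strcpy(p, "hello");
        // the dirty page should reach the file once the mapping goes away
        assert(munmap(p, 4096) == 0);
        char buf[6] = {};
        assert(pread(fd, buf, 5, 0) == 5);
        assert(strcmp(buf, "hello") == 0);
        close(fd);
    }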
-
Pekka Enberg authored
Use the new mmu::msync() function to make sure file-backed memory maps are written out to disk in munmap().
-
Pekka Enberg authored
This adds a simple msync() implementation for file-backed memory maps. It uses the newly added 'file_vma' data structure to write out and fsync the msync'd region, as suggested by Avi Kivity.
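Roughly, the write-out for one file-backed VMA overlapping the msync'd range looks like this (a sketch, with an int fd standing in for the fileref; the real code goes through the VFS layer):

    #include <sys/types.h>
    #include <unistd.h>
    #include <algorithm>
    #include <cstdint>

    // One file-backed mapping: [start, end) is mapped from 'fd' at 'file_offset'.
    int sync_overlap(uintptr_t start, uintptr_t end, int fd, off_t file_offset,
                     uintptr_t req_start, uintptr_t req_end)
    {
        uintptr_t s = std::max(start, req_start);
        uintptr_t e = std::min(end, req_end);
        if (s >= e)
            return 0;                              // no overlap with the request
        off_t off = file_offset + static_cast<off_t>(s - start);
        if (pwrite(fd, reinterpret_cast<const void*>(s), e - s, off) < 0)
            return -1;
        return fsync(fd);                          // make the data durable
    }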
-
Pekka Enberg authored
Add a new 'file_vma' class that extends 'vma'. It keeps track of the fileref and offset of file-backed VMAs, which msync() needs.
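An illustrative outline of the shape of such a class (member and method names here are guesses for exposition, not the exact OSv declarations):

    #include <sys/types.h>
    #include <cstdint>

    class vma_sketch {
    public:
        vma_sketch(uintptr_t start, uintptr_t end) : _start(start), _end(end) {}
        virtual ~vma_sketch() = default;
        uintptr_t start() const { return _start; }
        uintptr_t end() const   { return _end; }
    protected:
        uintptr_t _start, _end;
    };

    // A file-backed VMA additionally remembers which file it maps and at
    // what offset, which is exactly what msync() needs to write it back.
    class file_vma_sketch : public vma_sketch {
    public:
        file_vma_sketch(uintptr_t start, uintptr_t end, int fd, off_t offset)
            : vma_sketch(start, end), _fd(fd), _offset(offset) {}
        int   fd() const     { return _fd; }       // stand-in for OSv's fileref
        off_t offset() const { return _offset; }
    private:
        int   _fd;
        off_t _offset;
    };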
-
Avi Kivity authored
Spotted by Pekka.
-
Avi Kivity authored
Different parts of the source base have different error conventions; libc uses 0/-1+errno, while the rest of the source base uses 0/error. Wrap errors in a class to prevent confusion between the two.
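A hedged sketch of the idea - a small wrapper type that encodes the libc convention (set errno, return -1) so the two conventions cannot be mixed up silently; the names below are illustrative:

    #include <cerrno>
    #include <cstddef>
    #include <cstdint>

    class libc_error_sketch {
    public:
        explicit libc_error_sketch(int err) { errno = err; }
        operator int() const { return -1; }    // libc failure return value
    };

    // usage in a libc-facing entry point:
    int mprotect_like(void* addr, size_t len, int prot)
    {
        if (reinterpret_cast<uintptr_t>(addr) & 4095)
            return libc_error_sketch(EINVAL);  // sets errno and yields -1
        return 0;
    }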
-
- Sep 01, 2013
-
-
narkisr authored
-
narkisr authored
-
narkisr authored
-
narkisr authored
-
Ronen Narkis authored
-
Pekka Enberg authored
Fixes the following build breakage caused by commit a7d6b269 ("mman: Use libc_error() in mprotect()"):

    ../../libc/mman.cc: In function ‘int mprotect(void*, size_t, int)’:
    ../../libc/mman.cc:36:33: error: ‘libc_error’ was not declared in this scope
         return libc_error(EINVAL);
                                 ^
    ../../libc/mman.cc:41:33: error: ‘libc_error’ was not declared in this scope
         return libc_error(ENOMEM);
                                 ^
      CXX libc/pipe_buffer.o
      CXX libc/pipe.o
    make[1]: *** [libc/mman.o] Error 1
-
Pekka Enberg authored
-
- Aug 29, 2013
-
-
narkisr authored
-
Avi Kivity authored
-
Avi Kivity authored
This is used for temporarily dropping a lock in a lexical scope, and reacquiring it after an exit from the scope (similar to wait_until(mutex), but without the waiting):

    WITH_LOCK(preempt_lock) {
        // do some stuff
        while (not enough resources) {
            DROP_LOCK(preempt_lock) {
                acquire more resources
            }
            // reload anything that may have changed after DROP_LOCK()
        }
        // do more stuff with the acquired resources
    }

Note that DROP_LOCK() doesn't work well with recursively-taken locks.
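One way such a construct can be realized - purely illustrative, not the actual OSv macro - is an RAII helper that unlocks on entry to an inner scope and relocks on exit:

    #include <mutex>

    template <typename Lock>
    class lock_dropper {
    public:
        explicit lock_dropper(Lock& l) : _l(l) { _l.unlock(); }
        ~lock_dropper() { _l.lock(); }           // reacquire on scope exit
    private:
        Lock& _l;
    };

    std::mutex m;

    void example()
    {
        std::lock_guard<std::mutex> guard(m);
        // ... work under the lock ...
        {
            lock_dropper<std::mutex> drop(m);    // lock released here
            // ... acquire more resources without holding the lock ...
        }                                        // lock retaken here
        // ... continue under the lock; reload any state that may have changed
    }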
-
Avi Kivity authored
We don't want the compiler moving reads after a possible rcu_defer().
-
narkisr authored
-
narkisr authored
-
Pekka Enberg authored
-
Dor Laor authored
-
Dor Laor authored
-
Nadav Har'El authored
In the existing code, each -classpath or -jar parameter replaced the classpath. This is inconvenient (and unlike the Unix "java" program). Better to just add to the classpath. For example, now we can run:

    java.so -cp /java/cli.jar -jar /java/web.jar app

which runs web.jar's main class, but adds both cli.jar and web.jar to the classpath.
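The accumulating behavior, sketched (the real java.so launcher's argument handling differs in detail; this only illustrates appending instead of replacing):

    #include <string>
    #include <vector>

    std::string build_classpath(const std::vector<std::string>& args)
    {
        std::string cp;
        auto add = [&cp](const std::string& entry) {
            if (!cp.empty())
                cp += ":";
            cp += entry;                           // append, don't replace
        };
        for (size_t i = 0; i < args.size(); ++i) {
            if ((args[i] == "-cp" || args[i] == "-classpath") && i + 1 < args.size())
                add(args[++i]);
            else if (args[i] == "-jar" && i + 1 < args.size())
                add(args[++i]);                    // the jar joins the classpath too
        }
        return cp;
    }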
-
- Aug 28, 2013
-
-
Glauber Costa authored
Xen has hard requirements on page transfers and on how the grant tables are fed. The addresses need to be page aligned, since pfns and not addresses are used, and we need to provide at least a full page per buffer, since the hypervisor is free to fill any data within the page. To achieve that, the netfront driver will use m_cljget to attach an extended buffer to the mbuf, from the jumbop zone, since those buffers are page-sized. However, two problems arise from this:

1) Allocating a page goes through malloc_large. Our implementation of malloc_large is currently terribly inefficient, and that creates a very heavy contention site. What I am doing with this patch is to switch our uma implementation to alloc_page / free_page instead of malloc if the caller of zcreate so specified (and then, of course, specify it for the jumbop cache); a sketch of this follows below.

2) The refcount that is attached at the end of the buffer would either extend the buffer to 4100 bytes - defeating our purpose - or the buffer would have to be PAGE_SIZE - 4, to accommodate the refcount. But since the hypervisor will write to the whole page, it will eventually overwrite the refcount. To address that, I am allocating an external reference counter. BSD already has some infrastructure to do that, and I am taking advantage of it. However, I have found no way of implementing this in a way in which the reference count can be easily deducible from the address of the extended buffer, without having the supporting mbuf to start from. Any external data structure such as hashes would probably make freeing way too slow. Thankfully, uma_find_refcnt and UMA_ZONE_REFCNT seem to be used mostly in the setup/destruction phase (the mbuf refcount is used directly, open coded). So my proposal here is to remove UMA_ZONE_REFCNT for that zone.
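A highly simplified sketch of point (1) - letting a zone opt in to whole-page backing at creation time. The flag name and structure below are illustrative only; the real change lives in the imported BSD uma code:

    #include <cstddef>
    #include <cstdlib>

    static const unsigned ZONE_PAGE_BACKED = 1u << 0;   // illustrative flag

    struct zone_sketch {
        size_t   item_size;
        unsigned flags;
    };

    void* zone_alloc(zone_sketch* z)
    {
        if (z->flags & ZONE_PAGE_BACKED)
            return aligned_alloc(4096, 4096);  // stand-in for memory::alloc_page()
        return malloc(z->item_size);           // the old path through malloc_large
    }

    void zone_free(zone_sketch* z, void* item)
    {
        (void)z;
        free(item);                            // stand-in; the page path would use free_page()
    }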
-
Glauber Costa authored
The x2APIC specification says that reading from the X2APIC_ID MSR should return the physical apic id of the current processor. However, the Xen implementation (as of 4.2.2) is broken, and reads actually return the old-style xAPIC id. Even if they fix it, we still have hypervisors deployed that will return the wrong ID. We can work around this by testing whether the returned APIC id is in the form (id << 24), since in that case the first 24 bits will all be zero. Then at least we can get this working everywhere. This may pose a problem if we ever want to support more than 1 << 24 vCPUs (or if any other HV hands out some random x2apic ids), but that is highly unlikely anyway.
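The check described above boils down to something like this (illustrative):

    #include <cstdint>

    // A proper x2APIC read returns the full 32-bit id; the buggy path returns
    // an xAPIC-style id shifted into bits 31..24, leaving the low 24 bits zero.
    uint32_t normalize_apic_id(uint32_t raw)
    {
        if ((raw & 0x00ffffffu) == 0)
            return raw >> 24;   // looks like an xAPIC id - undo the shift
        return raw;             // already a proper x2APIC id
    }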
-
Glauber Costa authored
As I have described in a previous patch, the Xen hypervisor has a very nasty bug that causes all of the x2apic msr writes to trigger a GPF. Although the request proceeds fine despite the GPF, it does create a problem for the all-but-self style init sequences we are using: after "failing" (succeeding, but returning failure) to deliver the interrupt for the first cpu in the group, Xen will break the loop and therefore not deliver the SIPIs to the other cpus in the system at all. We can work around that by delivering interrupts to each cpu individually, instead of all-but-self.
-
Glauber Costa authored
Unfortunately, the Xen hypervisor has a very nasty bug (it seems to be fixed by a 2013 patch - which means that although it is fixed, a lot of hypervisors will still have it) that causes all of the x2apic msr writes to init-related registers (INIT, SIPI, etc.) to trigger a GPF. The way to work around this is to implement a form of "wrmsr_safe".
-
Glauber Costa authored
I ended up forgetting to remove some kprintfs from device.c that were inserted during Xen's blkfront development.
-
Pekka Enberg authored
Now that we can walk through the vma list, add mmap numbers to 'osv mem':

    (gdb) osv mem
    Total Memory: 4294564864 Bytes
    Mmap Memory:  3278278656 Bytes (76.34%)
    Free Memory:  474492928 Bytes (11.05%)
-
Pekka Enberg authored
-
- Aug 27, 2013
-
-
Nadav Har'El authored
Commit 65afd075 fixed mincore() to recognize unmapped addresses. However, it used mmu::ismapped(), which just checks for mmap()'ed addresses and doesn't know about malloc()ed memory. This causes trouble for libunwind (which we use for backtrace()), which tests mincore() on an on-stack variable; for non-pthread threads, this stack might be malloc'ed, not mmap'ed.

So this patch adds mmu::isreadable(), which checks that a given memory range is all readable (the memory can be mmapped, malloced, stack, whatever). mincore() now uses that.

mmu::isreadable() is implemented, following Avi's idea, by trying to read, with safe_load(), one byte from every page in the range. This approach is faster than page-table walking, especially for one-byte checks (which is all libunwind uses anyway), and also very simple.
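The probing loop is roughly as follows (a sketch; safe_load_byte stands in for OSv's safe_load(), which reports failure instead of faulting on an unmapped address):

    #include <algorithm>
    #include <cstddef>
    #include <cstdint>

    bool safe_load_byte(const char* p, char& out);   // assumed fault-tolerant read

    bool isreadable_sketch(const void* addr, size_t size)
    {
        const uintptr_t page = 4096;
        uintptr_t start = reinterpret_cast<uintptr_t>(addr);
        uintptr_t end = start + size;
        char dummy;
        // one probe per page touched by [addr, addr+size)
        for (uintptr_t p = start & ~(page - 1); p < end; p += page) {
            uintptr_t probe = std::max(p, start);
            if (!safe_load_byte(reinterpret_cast<const char*>(probe), dummy))
                return false;
        }
        return true;
    }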
-
Nadav Har'El authored
Unlike msync(), mincore() should also work on non-mmapped memory, such as stack and malloc()ed memory. Currently it doesn't - it fails on malloc()ed memory and only sometimes works on stacks (works on pthread stacks which are mmapped, but not on sched::thread stacks which are malloced by default). This patch adds a test to tst-mmap.cc to demonstrate this problem. The test currently fails, will be fixed in a follow-up patch.
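The failing case can be demonstrated with something along these lines (a sketch, not the exact tst-mmap.cc code):

    #include <sys/mman.h>
    #include <unistd.h>
    #include <cassert>
    #include <cstdint>
    #include <cstdlib>

    void test_mincore_on_malloced_memory()
    {
        size_t pg = static_cast<size_t>(sysconf(_SC_PAGESIZE));
        void* buf = malloc(pg);
        assert(buf != nullptr);
        unsigned char vec[1];
        // mincore() wants a page-aligned start address
        void* page = reinterpret_cast<void*>(
            reinterpret_cast<uintptr_t>(buf) & ~(pg - 1));
        // expected to succeed: the memory is readable even though not mmap'ed
        assert(mincore(page, pg, vec) == 0);
        free(buf);
    }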
-
Glauber Costa authored
Most of the performance problems I have found on Xen were due to the fact that we were hitting malloc_large consistently, for allocations that we should be able to service in some other way. Because malloc_large in our implementation is such a bottleneck, it was very useful for me to have separate tracepoints for it. I am therefore proposing them for inclusion.
-
Nadav Har'El authored
Commit 65afd075 that fixed mincore() exposed a deadlock in the leak detector, caused by two threads taking two locks in opposite order:

Thread 1: malloc() does alloc_tracker::remember(). This takes the tracker lock and calls backtrace(), calling mincore(), which takes the vma_list_mutex.

Thread 2: mmap() does mmu::allocate(), which takes the vma_list_mutex and then, through mmu::populate::small_page, calls memory::alloc_page(), which calls alloc_tracker::remember() and takes the tracker lock.

This patch fixes this deadlock: alloc_tracker::remember() will now drop its lock while running backtrace(), as the lock is only needed to protect the allocations[] array. We need to retake the lock after backtrace() completes, to copy the backtrace back to the allocations[] array.

Previously, the lock's depth was also (ab)used for avoiding nested allocation tracking (e.g., tracking of memory allocation done inside backtrace() itself), but now that backtrace() is run without the lock, we need a different mechanism - a per-thread "in_tracker" flag, which is turned on inside the alloc_tracker::remember()/forget() methods.
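Condensed, the new remember() path looks roughly like this (helper names are placeholders for the real tracker internals):

    #include <cstddef>
    #include <mutex>
    #include <vector>

    struct trace { std::vector<void*> frames; };
    trace capture_backtrace();        // placeholder: may itself call malloc()
    std::mutex tracker_lock;          // protects only the allocations[] table

    void remember(void* obj, size_t size)
    {
        static thread_local bool in_tracker = false;
        if (in_tracker)
            return;                   // allocation made by backtrace() itself
        in_tracker = true;

        trace bt = capture_backtrace();           // tracker lock NOT held here

        {
            std::lock_guard<std::mutex> guard(tracker_lock);
            // ... record the allocations[] entry for obj/size and copy bt in ...
            (void)obj; (void)size; (void)bt;
        }
        in_tracker = false;
    }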
-
Glauber Costa authored
This allows lazy people like me to just copy the instructions
-
Glauber Costa authored
We can't trust the state of the FPU and the CSR registers to always be sane. Apparently, they aren't on at least one version of Xen (which happens to be the one I am using). Initialize them manually for all CPUs on bringup.
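A minimal sketch of what per-CPU initialization at bringup can look like (the exact values and the place this runs in the arch code may differ):

    // Reset x87 state and load a sane MXCSR (all SSE exceptions masked,
    // round-to-nearest) on the current CPU.
    static void init_fpu_on_this_cpu()
    {
        asm volatile("fninit");
        unsigned int mxcsr = 0x1f80;   // the architectural default value
        asm volatile("ldmxcsr %0" : : "m"(mxcsr));
    }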
-
Glauber Costa authored
In the Xen interrupt code, I made the mistake of exchanging the previous value of _irq_pending with true, which meant that we were constantly polling for data in the interrupt threads. This was responsible for the latency spikes I was seeing. The simple "ping" test still shows bad results in absolute terms, but at least now the spikes are gone.
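In essence (illustrative):

    #include <atomic>

    std::atomic<bool> irq_pending{false};

    // The interrupt thread must *consume* the flag by exchanging it with false.
    // Exchanging it with true - the bug described above - leaves the flag
    // permanently set, so the wait loop degenerates into constant polling.
    bool take_pending()
    {
        return irq_pending.exchange(false);
    }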
-
- Aug 26, 2013
-
-
Nadav Har'El authored
sched.hh included elf.hh just so it can refer to the elf::tls_data type. But now that we have rcu.hh, which includes sched.hh and therefore elf.hh, if we wish to use rcu in elf.hh (we'll do this in a later patch), we have an include loop mess. So better not to include elf.hh from sched.hh, and just declare the one struct we need. After sched.hh no longer includes elf.hh and the dozen includes that it further included, we need to add missing includes to some of the code that included sched.hh and relied on its implicit includes.
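The pattern, sketched (assuming the type is only referred to, e.g. by pointer or reference, so a declaration suffices):

    // Instead of:  #include "elf.hh"   // pulls in a dozen further headers
    // sched.hh can carry just the declaration it needs:
    namespace elf {
        struct tls_data;               // defined in elf.hh
    }

    class thread_sketch {
        // a pointer/reference to the declared type works without the full header
        const elf::tls_data* _tls = nullptr;
    };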
-
Avi Kivity authored
A signal within a signal handler is really bad news; abort when it happens to let the developers debug it.
-
Avi Kivity authored
Trying to execute the null pointer, or faulting within kernel code, is a really bad sign, and it's better to abort early when it happens.
-