- Nov 26, 2013
-
-
Avi Kivity authored
Signed-off-by:
Avi Kivity <avi@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Avi Kivity authored
We previously had the POSIX variant only. Implement the GNU variant as well, and update the header to point to the correct function based on the dialect selected. The POSIX variant is renamed __xpg_strerror_r() to conform to the ABI standards. This fixes calls to strerror_r() from binaries which were compiled with _GNU_SOURCE (libboost_system.a) but preserves the correct behaviour for BSD derived source. Signed-off-by:
Avi Kivity <avi@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
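The difference between the two strerror_r() dialects can be illustrated with a minimal sketch. The function names sketch_xpg_strerror_r() and sketch_gnu_strerror_r() and the tiny message table are hypothetical stand-ins, not the code from this commit. The POSIX variant returns an int status and always fills the caller's buffer, while the GNU variant returns a char* that may point at a static string and leave the buffer untouched.

```cpp
#include <cerrno>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical stand-in for the system's error message table.
static const char* sketch_lookup_message(int errnum) {
    switch (errnum) {
    case 0:      return "Success";
    case ENOENT: return "No such file or directory";
    case EINVAL: return "Invalid argument";
    default:     return nullptr;
    }
}

// POSIX (XSI) dialect: writes the message into buf and returns 0,
// or returns an error number (EINVAL / ERANGE) on failure.
int sketch_xpg_strerror_r(int errnum, char* buf, std::size_t buflen) {
    const char* msg = sketch_lookup_message(errnum);
    if (!msg) {
        return EINVAL;              // unknown error number
    }
    if (std::strlen(msg) >= buflen) {
        return ERANGE;              // buffer too small for message + NUL
    }
    std::strcpy(buf, msg);
    return 0;
}

// GNU dialect: returns a pointer to the message. For known errors this
// is a static string and buf is not used at all.
char* sketch_gnu_strerror_r(int errnum, char* buf, std::size_t buflen) {
    const char* msg = sketch_lookup_message(errnum);
    if (msg) {
        return const_cast<char*>(msg);
    }
    if (buflen > 0) {
        std::snprintf(buf, buflen, "Unknown error %d", errnum);
    }
    return buf;
}
```

A caller compiled against the GNU prototype treats the return value as a string pointer, so linking it against the POSIX implementation (which returns 0 on success) would hand it a null pointer. That is why both symbols need to exist side by side.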
-
Avi Kivity authored
Some functions (strerror_r()) are defined differently based on the source dialect. We need to provide both dialects since we have mixed source. Add a source-dialect macro (defaulting to _GNU_SOURCE) and override it as appropriate. Signed-off-by:
Avi Kivity <avi@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Nadav Har'El authored
Started adding Doxygen documentation for the scheduler. Currently only set_priority() and priority() are documented. Signed-off-by:
Nadav Har'El <nyh@cloudius-systems.com> Signed-off-by:
Avi Kivity <avi@cloudius-systems.com>
-
Nadav Har'El authored
This patch replaces the algorithm which the scheduler uses to keep track of threads' runtime, and to choose which thread to run next and for how long.

The previous algorithm used the raw cumulative runtime of a thread as its runtime measure. But comparing these numbers directly was impossible: e.g., should a thread that slept for an hour now get an hour of uninterrupted CPU time? This resulted in a hodgepodge of heuristics which "modified" and "fixed" the runtime. These heuristics did work quite well in our test cases, but we were forced to add more and more unjustified heuristics and constants to fix scheduling bugs as they were discovered. The existing scheduler was especially problematic with thread migration (moving a thread from one CPU to another) as the runtime measure on one CPU was meaningless in another. This bug, if not corrected (e.g., by the patch which I sent a month ago), can cause crucial threads to acquire exceedingly high runtimes by mistake, and resulted in the tst-loadbalance test using only one CPU on a two-CPU guest.

The new scheduling algorithm follows a much more rigorous design, proposed by Avi Kivity in: https://docs.google.com/document/d/1W7KCxOxP-1Fy5EyF2lbJGE2WuKmu5v0suYqoHas1jRM/edit?usp=sharing

To make a long story short (read the document if you want all the details), the new algorithm is based on a runtime measure R which is the running decaying average of the thread's running time. It is a decaying average in the sense that the thread's act of running or sleeping in recent history is given more weight than its behavior a long time ago. This measure R can tell us which of the runnable threads to run next (the one with the lowest R), and using some high-school-level mathematics, we can calculate for how long to run this thread until it should be preempted by the next one. R carries the same meaning on all CPUs, so CPU migration becomes trivial.

The actual implementation uses a normalized version of R, called R'' (Rtt in the code), which is also explained in detail in the document. This Rtt allows updating just the running thread's runtime - not all threads' runtime - as time passes, making the whole calculation much more tractable.

The benefits of the new scheduler code over the existing one are:
1. A more rigorous design with fewer unjustified heuristics.
2. A thread's runtime measurement correctly survives a migration to a different CPU, unlike the existing code (which sometimes botches it up, leading to threads hanging). In particular, tst-loadbalance now gives good results for the "intermittent thread" test, unlike the previous code which in 50% of the runs caused one CPU to be completely wasted (when the load-balancing thread hung).
3. The new algorithm can look at a much longer runtime history than the previous algorithm did. With the default tau=200ms, the one-cpu intermittent thread test of tst-scheduler now provides good fairness for sleep durations of 1ms-32ms. The previous algorithm was never fair in any of those tests.
4. The new algorithm is more deterministic in its use of timers (with thyst=2_ms: up to 500 timers a second), resulting in less varied performance in high-context-switch benchmarks like tst-ctxsw.

This scheduler does very well on the fairness tests tst-scheduler and fairly well on tst-loadbalance. Even better performance on that second test will require an additional patch for the idle thread to wake other cpus' load balancing threads.
As expected the new scheduler is somewhat slower than the existing one (as we now do some relatively complex calculations instead of trivial integer operations), but thanks to using approximations when possible and to various other optimizations, the difference is relatively small: On my laptop, tst-ctxsw.so, which measures "context switch" time (actually, also including the time to use mutex and condvar which this test uses to cause context switching), on the "colocated" test I measured 355 ns with the old scheduler, and 382 ns with the new scheduler - meaning that the new scheduler adds 27ns of overhead to every context switch. To see that this penalty is minor, consider that tst-ctxsw is an extreme example, doing 3 million context switches a second, and even there it only slows down the workload by 7%. Signed-off-by:
Nadav Har'El <nyh@cloudius-systems.com> Signed-off-by:
Avi Kivity <avi@cloudius-systems.com>
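The idea of a decaying average can be made concrete with a small sketch. It assumes a plain exponential decay with time constant tau, which is a simplification chosen for illustration: the commit's actual implementation uses the normalized R'' described in the design document, and the names sketch_thread, advance() and pick_next() are hypothetical.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical per-thread state: R is a decaying average of "was running",
// so it stays between 0 (idle for a long time) and 1 (ran for a long time).
struct sketch_thread {
    double R = 0.0;
};

constexpr double sketch_tau_ms = 200.0;   // same default tau as quoted above

// Advance a thread's measure over an interval of dt_ms during which the
// thread was either running the whole time or not running at all.
void advance(sketch_thread& t, double dt_ms, bool running) {
    const double decay = std::exp(-dt_ms / sketch_tau_ms);
    // Old history fades by `decay`; the new interval contributes the rest.
    t.R = t.R * decay + (running ? 1.0 - decay : 0.0);
}

// Choose the runnable thread with the lowest R.
// Precondition: `runnable` is not empty.
std::size_t pick_next(const std::vector<sketch_thread>& runnable) {
    std::size_t best = 0;
    for (std::size_t i = 1; i < runnable.size(); ++i) {
        if (runnable[i].R < runnable[best].R) {
            best = i;
        }
    }
    return best;
}
```

With this form, a thread that ran for 100 ms ends up with R of about 0.39, and a thread that then sleeps for a full second decays to well under 0.01. A long sleep therefore earns priority without earning an unbounded amount of CPU time.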
-
Nadav Har'El authored
The schedule() and cpu::schedule() functions had a "yield" parameter. This parameter was inconsistently used (it's not clear why specific places called it with "true" and others with "false"), but moreover, was always ignored! So this patch removes the parameter of schedule(). If you really want a yield, call yield(), not schedule(). Signed-off-by:
Nadav Har'El <nyh@cloudius-systems.com> Signed-off-by:
Avi Kivity <avi@cloudius-systems.com>
-
Nadav Har'El authored
The idle thread cpu::idle() waits for other threads to become runnable, and then lets them run. It used to yield the CPU by calling yield(), because in early OSv history we didn't have an idle priority, so simply calling schedule() would not guarantee that the new thread, not the idle thread, would run. But now we actually do have an idle priority: if the run queue is not empty, we are sure that calling schedule() will run another thread, not the idle thread. So this patch calls schedule(), which is simpler, faster, and more reliable than yield(). Signed-off-by:
Nadav Har'El <nyh@cloudius-systems.com> Signed-off-by:
Avi Kivity <avi@cloudius-systems.com>
-
Nadav Har'El authored
The scheduler (reschedule_from_interrupt()) changes the runtime of the current thread. This assumes that the current thread is not in the runqueue - because the runqueue is sorted by runtime, and modifying the runtime of a thread which is already in the runqueue ruins the sorted tree's invariants.

Unfortunately, the existing code broke this assumption in two places:
1. When handle_incoming_wakeups() wakes up the current thread (i.e., a thread that prepared to wait but was woken before it could go to sleep), the current thread was queued. Instead, we need to simply return the thread to the "running" state.
2. yield() queued the current thread. Rather, it needs to just change its runtime, and reschedule_from_interrupt() will decide to queue this thread.

This patch fixes the first problem. The second problem will be solved by a yield() rewrite which is part of the new scheduler in a later patch.

By the way, after we fix both problems, we can also be sure that the strange if(n != thread::current()) in the scheduler is always true. This is because n, picked up from the run queue, could never be the current thread, because the current thread isn't in the run queue.
Signed-off-by:
Nadav Har'El <nyh@cloudius-systems.com> Signed-off-by:
Avi Kivity <avi@cloudius-systems.com>
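The invariant described above applies to any container sorted by a mutable key. The sketch below uses std::set as a stand-in for the scheduler's sorted tree; sketch_thread, by_runtime and charge_runtime() are hypothetical names, not OSv's types. It shows the safe pattern of removing an element before changing the key it is sorted by.

```cpp
#include <set>

// Hypothetical thread record; `runtime` is the sort key of the run queue.
struct sketch_thread {
    int id;
    long runtime;
};

// Order by runtime, breaking ties by id so distinct threads never compare equal.
struct by_runtime {
    bool operator()(const sketch_thread* a, const sketch_thread* b) const {
        if (a->runtime != b->runtime) {
            return a->runtime < b->runtime;
        }
        return a->id < b->id;
    }
};

using sketch_runqueue = std::set<sketch_thread*, by_runtime>;

// Change a thread's runtime without corrupting the ordering: the thread is
// taken out of the queue (if it was there) *before* its key is modified,
// and re-inserted afterwards at its new position.
void charge_runtime(sketch_runqueue& rq, sketch_thread* t, long delta) {
    const bool was_queued = rq.erase(t) > 0;   // lookup uses the old key
    t->runtime += delta;
    if (was_queued) {
        rq.insert(t);
    }
}
```

Modifying t->runtime while t is still inside the set would leave the tree ordered by a stale key, so later lookups and insertions could silently go wrong. Keeping the current thread out of the run queue avoids the problem in the same way.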
-
Pekka Enberg authored
Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Raphael S. Carvalho authored
v2: Convert everything to std::chrono::time_point (Avi Kivity). v3: Use the to_timeptr approach suggested by Nadav Har'El. This test checks the functionality of the utimes support. Signed-off-by:
Raphael S. Carvalho <raphaelsc@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Raphael S. Carvalho authored
v2: Check limit of microseconds, among other minor changes (Nadav Har'El, Avi Kivity).
v3: Get rid of goto & label by adding an else clause (Nadav Har'El).
- This patch adds utimes support.
- This patch addresses issue #93.
Signed-off-by:
Raphael S. Carvalho <raphaelsc@cloudius-systems.com> Tested-by:
Tomasz Grabiec <tgrabiec@gmail.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
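The "limit of microseconds" check mentioned in v2 can be sketched as follows. The helper name sketch_timeval_to_timespec() is hypothetical, and rejecting out-of-range values with EINVAL is an assumption about the intended behaviour, not something this commit message states.

```cpp
#include <cerrno>
#include <ctime>
#include <sys/time.h>

// Hypothetical helper: validate one timeval as passed to utimes() and
// convert it to a timespec. Returns 0 on success, or EINVAL when the
// microsecond field is outside [0, 1000000).
int sketch_timeval_to_timespec(const timeval& tv, timespec& ts) {
    if (tv.tv_usec < 0 || tv.tv_usec >= 1000000) {
        return EINVAL;
    }
    ts.tv_sec = tv.tv_sec;
    ts.tv_nsec = tv.tv_usec * 1000L;   // microseconds -> nanoseconds
    return 0;
}
```

Using an early return for the invalid case keeps the success path free of goto and labels, in the spirit of the v3 note.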
-
Raphael S. Carvalho authored
Attribute flags were moved from 'bsd/sys/cddl/compat/opensolaris/sys/vnode.h' to 'include/osv/vnode_attr.h'. 'bsd/sys/cddl/compat/opensolaris/sys/vnode.h' now includes 'include/osv/vnode_attr.h' exactly at the place the flags were previously located. 'fs/vfs/vfs.h' includes 'include/osv/vnode_attr.h' because functions that rely on the setattr feature must specify the flags corresponding to the attr fields that are going to be changed. Approach suggested by Nadav Har'El. Signed-off-by:
Raphael S. Carvalho <raphaelsc@cloudius-systems.com> Tested-by:
Tomasz Grabiec <tgrabiec@gmail.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Raphael S. Carvalho authored
Use vop_eperm instead to warn the caller about the lack of support (Glauber Costa). Signed-off-by:
Raphael S. Carvalho <raphaelsc@cloudius-systems.com> Tested-by:
Tomasz Grabiec <tgrabiec@gmail.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Raphael S. Carvalho authored
Signed-off-by:
Raphael S. Carvalho <raphaelsc@cloudius-systems.com> Tested-by:
Tomasz Grabiec <tgrabiec@gmail.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Nadav Har'El authored
This patch makes incorrect usage of percpu<>/PERCPU() produce compilation errors instead of silent runtime corruption. Thanks to Dmitry for first noticing this issue in xen_intr.cc (see his separate patch), and to Avi for suggesting a compile-time fix.

With this patch:
1. Using percpu<...> to *define* a per-cpu variable fails compilation. Instead, PERCPU(...) must be used for the definition, which is important because it places the variable in the ".percpu" section.
2. If a *declaration* is needed additionally (e.g., for a static class member), percpu<...> must be used, not PERCPU(). Trying to use PERCPU() for declaration will cause a compilation error.
3. PERCPU() only works on statically-constructed objects - global variables, static function-variables and static class-members. Trying to use it on a dynamically-constructed object - stack variable, class field, or operator new - will cause a compilation error.

With this patch, the bug in xen_intr.cc would have been caught at compile time.
Signed-off-by:
Nadav Har'El <nyh@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Dmitry Fleytman authored
The bug fixed by this patch made OSv crash on Xen during boot. The problem started to show up after commit:

commit ed808267
Author: Nadav Har'El <nyh@cloudius-systems.com>
Date: Mon Nov 18 23:01:09 2013 +0200

    percpu: Reduce size of .percpu section

Signed-off-by:
Dmitry Fleytman <dmitry@daynix.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
- Nov 25, 2013
-
-
Dmitry Fleytman authored
This feature will be used to release images with preinstalled applications. Signed-off-by:
Dmitry Fleytman <dmitry@daynix.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Amnon Heiman authored
Start up shell and management web in parallel to make boot faster. Note that we also switch to latest mgmt.git which decouples JRuby and CRaSH startup. Signed-off-by:
Amnon Heiman <amnon@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Amnon Heiman authored
When MultiJarLoader is used as the main class, it reads a configuration file for the Java loading. Each line in the file is used to start a main; a line can either use -jar or specify a main class. Signed-off-by:
Amnon Heiman <amnon@cloudius-systems.com> Reviewed-by:
Tomasz Grabiec <tgrabiec@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Pekka Enberg authored
As suggested by Nadav, add tests for mincore() interaction with demand paging. Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Pekka Enberg authored
This adds a simple mmap microbenchmark that can be run on both OSv and Linux. The benchmark mmaps memory for various sizes and touches the mmap'd memory in 4K increments to fault in memory. The benchmark also repeats the same tests using MAP_POPULATE for reference.

OSv page faults are slightly slower than Linux on the first iteration but faster on subsequent iterations after the host operating system has faulted in memory for the guest.

I've included full numbers on a 2-core Sandy Bridge i7 for an OSv guest, Linux guest, and Linux host below:

OSv guest
---------

Iteration 1

          time (seconds)
 MiB   demand   populate
    1   0.004      0.000
    2   0.000      0.000
    4   0.000      0.000
    8   0.001      0.000
   16   0.003      0.000
   32   0.007      0.000
   64   0.013      0.000
  128   0.024      0.000
  256   0.052      0.001
  512   0.229      0.002
 1024   0.587      0.005

Iteration 2

          time (seconds)
 MiB   demand   populate
    1   0.001      0.000
    2   0.000      0.000
    4   0.000      0.000
    8   0.001      0.000
   16   0.002      0.000
   32   0.004      0.000
   64   0.010      0.000
  128   0.019      0.001
  256   0.036      0.001
  512   0.069      0.002
 1024   0.137      0.005

Iteration 3

          time (seconds)
 MiB   demand   populate
    1   0.001      0.000
    2   0.000      0.000
    4   0.000      0.000
    8   0.001      0.000
   16   0.002      0.000
   32   0.005      0.000
   64   0.010      0.000
  128   0.020      0.000
  256   0.039      0.001
  512   0.087      0.002
 1024   0.138      0.005

Iteration 4

          time (seconds)
 MiB   demand   populate
    1   0.001      0.000
    2   0.000      0.000
    4   0.000      0.000
    8   0.001      0.000
   16   0.002      0.000
   32   0.004      0.000
   64   0.012      0.000
  128   0.025      0.001
  256   0.040      0.001
  512   0.082      0.002
 1024   0.138      0.005

Iteration 5

          time (seconds)
 MiB   demand   populate
    1   0.001      0.000
    2   0.000      0.000
    4   0.000      0.000
    8   0.001      0.000
   16   0.002      0.000
   32   0.004      0.000
   64   0.012      0.000
  128   0.028      0.001
  256   0.040      0.001
  512   0.082      0.002
 1024   0.166      0.005

Linux guest
-----------

Iteration 1

          time (seconds)
 MiB   demand   populate
    1   0.001      0.000
    2   0.001      0.000
    4   0.002      0.000
    8   0.003      0.000
   16   0.005      0.000
   32   0.008      0.000
   64   0.015      0.000
  128   0.151      0.001
  256   0.090      0.001
  512   0.266      0.003
 1024   0.401      0.006

Iteration 2

          time (seconds)
 MiB   demand   populate
    1   0.000      0.000
    2   0.000      0.000
    4   0.001      0.000
    8   0.001      0.000
   16   0.002      0.000
   32   0.005      0.000
   64   0.009      0.000
  128   0.019      0.001
  256   0.037      0.001
  512   0.072      0.003
 1024   0.144      0.006

Iteration 3

          time (seconds)
 MiB   demand   populate
    1   0.000      0.000
    2   0.001      0.000
    4   0.001      0.000
    8   0.001      0.000
   16   0.002      0.000
   32   0.005      0.000
   64   0.010      0.000
  128   0.019      0.001
  256   0.037      0.001
  512   0.072      0.003
 1024   0.143      0.006

Iteration 4

          time (seconds)
 MiB   demand   populate
    1   0.000      0.000
    2   0.001      0.000
    4   0.001      0.000
    8   0.001      0.000
   16   0.003      0.000
   32   0.005      0.000
   64   0.010      0.000
  128   0.020      0.001
  256   0.038      0.001
  512   0.073      0.003
 1024   0.143      0.006

Iteration 5

          time (seconds)
 MiB   demand   populate
    1   0.000      0.000
    2   0.001      0.000
    4   0.001      0.000
    8   0.001      0.000
   16   0.003      0.000
   32   0.005      0.000
   64   0.010      0.000
  128   0.020      0.001
  256   0.037      0.001
  512   0.072      0.003
 1024   0.144      0.006

Linux host
----------

Iteration 1

          time (seconds)
 MiB   demand   populate
    1   0.000      0.000
    2   0.001      0.000
    4   0.001      0.000
    8   0.001      0.000
   16   0.002      0.000
   32   0.005      0.000
   64   0.009      0.000
  128   0.019      0.001
  256   0.035      0.001
  512   0.152      0.003
 1024   0.286      0.011

Iteration 2

          time (seconds)
 MiB   demand   populate
    1   0.000      0.000
    2   0.000      0.000
    4   0.001      0.000
    8   0.001      0.000
   16   0.002      0.000
   32   0.004      0.000
   64   0.010      0.000
  128   0.018      0.001
  256   0.035      0.001
  512   0.192      0.003
 1024   0.334      0.011

Iteration 3

          time (seconds)
 MiB   demand   populate
    1   0.000      0.000
    2   0.000      0.000
    4   0.001      0.000
    8   0.001      0.000
   16   0.002      0.000
   32   0.004      0.000
   64   0.010      0.000
  128   0.018      0.001
  256   0.035      0.001
  512   0.194      0.003
 1024   0.329      0.011

Iteration 4

          time (seconds)
 MiB   demand   populate
    1   0.000      0.000
    2   0.000      0.000
    4   0.001      0.000
    8   0.001      0.000
   16   0.002      0.000
   32   0.004      0.000
   64   0.010      0.000
  128   0.018      0.001
  256   0.036      0.001
  512   0.138      0.003
 1024   0.341      0.011

Iteration 5

          time (seconds)
 MiB   demand   populate
    1   0.000      0.000
    2   0.000      0.000
    4   0.001      0.000
    8   0.001      0.000
   16   0.002      0.000
   32   0.004      0.000
   64   0.010      0.000
  128   0.018      0.001
  256   0.035      0.001
  512   0.135      0.002
 1024   0.324      0.011

Reviewed-by:
Glauber Costa <glommer@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
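The measurement loop described above can be sketched roughly as follows. This is not the benchmark from the commit: the function name sketch_time_mmap_touch() is hypothetical, and timing only the touch loop (not the mmap() call itself) is an assumption about how the "demand" and "populate" columns were obtained.

```cpp
#include <chrono>
#include <cstddef>
#include <sys/mman.h>

// Map `size` bytes of anonymous memory, write one byte in every 4 KiB
// page, and return the seconds spent touching the pages. `extra_flags`
// is OR-ed into the mmap() flags (0 for demand paging; MAP_POPULATE,
// where the platform provides it, to pre-fault the mapping).
// Returns a negative value if mmap() fails.
double sketch_time_mmap_touch(std::size_t size, int extra_flags) {
    constexpr std::size_t page = 4096;

    void* p = mmap(nullptr, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | extra_flags, -1, 0);
    if (p == MAP_FAILED) {
        return -1.0;
    }

    // volatile keeps the compiler from optimizing the stores away.
    volatile char* mem = static_cast<volatile char*>(p);

    const auto start = std::chrono::steady_clock::now();
    for (std::size_t off = 0; off < size; off += page) {
        mem[off] = 1;                 // first write faults the page in
    }
    const auto end = std::chrono::steady_clock::now();

    munmap(p, size);
    return std::chrono::duration<double>(end - start).count();
}
```

Calling it once with 0 and once with MAP_POPULATE for each size from 1 MiB to 1024 MiB would produce one row per size in the same shape as the tables above.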
-
Pekka Enberg authored
Reviewed-by:
Glauber Costa <glommer@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Pekka Enberg authored
Switch to demand paging for anonymous virtual memory.

I used SPECjvm2008 to verify performance impact. The numbers are mostly the same with a few exceptions, most visible in the 'serial' benchmark. However, there's quite a lot of variance between SPECjvm2008 runs so I wouldn't read too much into them.

As we need the demand paging mechanism and the performance numbers suggest that the implementation is reasonable, I'd merge the patch as-is and optimize it later.

Before:

Running specJVM2008 benchmarks on an OSV guest.
Score on compiler.compiler: 331.23 ops/m
Score on compiler.sunflow: 131.87 ops/m
Score on compress: 118.33 ops/m
Score on crypto.aes: 41.34 ops/m
Score on crypto.rsa: 204.12 ops/m
Score on crypto.signverify: 196.49 ops/m
Score on derby: 170.12 ops/m
Score on mpegaudio: 70.37 ops/m
Score on scimark.fft.large: 36.68 ops/m
Score on scimark.lu.large: 13.43 ops/m
Score on scimark.sor.large: 22.29 ops/m
Score on scimark.sparse.large: 29.35 ops/m
Score on scimark.fft.small: 195.19 ops/m
Score on scimark.lu.small: 233.95 ops/m
Score on scimark.sor.small: 90.86 ops/m
Score on scimark.sparse.small: 64.11 ops/m
Score on scimark.monte_carlo: 145.44 ops/m
Score on serial: 94.95 ops/m
Score on sunflow: 73.24 ops/m
Score on xml.transform: 207.82 ops/m
Score on xml.validation: 343.59 ops/m

After:

Score on compiler.compiler: 346.78 ops/m
Score on compiler.sunflow: 132.58 ops/m
Score on compress: 116.05 ops/m
Score on crypto.aes: 40.26 ops/m
Score on crypto.rsa: 206.67 ops/m
Score on crypto.signverify: 194.47 ops/m
Score on derby: 175.22 ops/m
Score on mpegaudio: 76.18 ops/m
Score on scimark.fft.large: 34.34 ops/m
Score on scimark.lu.large: 15.00 ops/m
Score on scimark.sor.large: 24.80 ops/m
Score on scimark.sparse.large: 33.10 ops/m
Score on scimark.fft.small: 168.67 ops/m
Score on scimark.lu.small: 236.14 ops/m
Score on scimark.sor.small: 110.77 ops/m
Score on scimark.sparse.small: 121.29 ops/m
Score on scimark.monte_carlo: 146.03 ops/m
Score on serial: 87.03 ops/m
Score on sunflow: 77.33 ops/m
Score on xml.transform: 205.73 ops/m
Score on xml.validation: 351.97 ops/m

Reviewed-by:
Glauber Costa <glommer@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Pekka Enberg authored
Use optimistic locking in populate() to make it robust against concurrent page faults. Reviewed-by:
Glauber Costa <glommer@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
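The commit message does not say how populate() applies optimistic locking, so the following is only one common form of the idea, written under stated assumptions. The page is prepared without holding any lock and is published with a compare-and-swap. If a concurrent fault got there first, the loser discards its own page. The names sketch_pte and sketch_populate() are hypothetical, and a real page table entry is not a std::atomic<void*>.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdlib>

// Hypothetical page table entry: null means "not yet populated".
using sketch_pte = std::atomic<void*>;

// Ensure the entry points at a zeroed page and return that page.
// Safe to call from several faulting threads at once for the same entry.
// Returns nullptr only if allocation fails.
void* sketch_populate(sketch_pte& pte, std::size_t page_size) {
    // Fast path: somebody already populated this entry.
    if (void* existing = pte.load(std::memory_order_acquire)) {
        return existing;
    }

    // Optimistic step: do the expensive work without any lock held.
    void* fresh = std::calloc(1, page_size);
    if (!fresh) {
        return nullptr;
    }

    // Publish only if the entry is still empty.
    void* expected = nullptr;
    if (pte.compare_exchange_strong(expected, fresh,
                                    std::memory_order_acq_rel,
                                    std::memory_order_acquire)) {
        return fresh;
    }

    // Lost the race: another fault installed a page first. Drop ours
    // and use the winner's page, which is now stored in `expected`.
    std::free(fresh);
    return expected;
}
```

The design trade-off is that a losing thread wastes one allocation, in exchange for never blocking other faults while the page is being prepared.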
-
Pekka Enberg authored
Add permission flags to VMAs. They will be used by mprotect() and the page fault handler. Reviewed-by:
Glauber Costa <glommer@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Tomasz Grabiec authored
Duration analysis is based on trace pairs which follow the convention in which function entry generates a trace named X and function exit generates either trace X_ret or X_err. Traces which do not have an accompanying return tracepoint are ignored.

New commands:

  osv trace summary
      Prints execution time statistics for traces

  osv trace duration {function}
      Prints timed traces sorted by duration in descending order. Optionally narrowed down to a specified function

gdb$ osv trace summary
Execution times [ms]:
name          count    min    50%    90%    99%  99.9%    max   total
vfs_pwritev       3  0.682  1.042  1.078  1.078  1.078  1.078   2.801
vfs_pwrite       32  0.006  1.986  3.313  6.816  6.816  6.816  53.007

gdb$ osv trace duration
0xffffc000671f0010 1 1385318632.103374 6.816 vfs_pwrite
0xffffc0003bbef010 0 1385318637.929424 3.923 vfs_pwrite

Signed-off-by:
Tomasz Grabiec <tgrabiec@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
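The X / X_ret / X_err pairing convention can be illustrated with a small sketch. It is not the gdb command from this commit. The record layout, the per-thread matching, and the names sketch_trace, sketch_duration and sketch_pair_traces() are all assumptions made for the example.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical trace record: which thread emitted it, when, and its name.
struct sketch_trace {
    unsigned long thread;
    double time;
    std::string name;
};

struct sketch_duration {
    std::string name;
    double seconds;
};

static bool sketch_ends_with(const std::string& s, const std::string& suffix) {
    return s.size() >= suffix.size() &&
           s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Pair each entry trace X with the next X_ret or X_err on the same thread
// and return the durations, longest first. Entries that never see a
// matching return trace are ignored.
std::vector<sketch_duration> sketch_pair_traces(const std::vector<sketch_trace>& traces) {
    std::map<std::pair<unsigned long, std::string>, double> open;
    std::vector<sketch_duration> out;

    for (const sketch_trace& t : traces) {
        if (sketch_ends_with(t.name, "_ret") || sketch_ends_with(t.name, "_err")) {
            const std::string base = t.name.substr(0, t.name.size() - 4);
            auto it = open.find(std::make_pair(t.thread, base));
            if (it == open.end()) {
                continue;                      // return without a recorded entry
            }
            out.push_back(sketch_duration{base, t.time - it->second});
            open.erase(it);
        } else {
            open[std::make_pair(t.thread, t.name)] = t.time;
        }
    }

    std::sort(out.begin(), out.end(),
              [](const sketch_duration& a, const sketch_duration& b) {
                  return a.seconds > b.seconds;
              });
    return out;
}
```

Statistics such as the min, percentile and max columns of the summary command could then be computed per name from the returned durations.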
-
Tomasz Grabiec authored
Signed-off-by:
Tomasz Grabiec <tgrabiec@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Tomasz Grabiec authored
The iteration logic was duplicated in two places. The patches yet to come would add yet another place, so let's refactor first. Signed-off-by:
Tomasz Grabiec <tgrabiec@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Raphael S. Carvalho authored
Calling feof on a closed file isn't safe, and the result is undefined. Found while auditing the code. Signed-off-by:
Raphael S. Carvalho <raphaelsc@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
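The safe ordering is to query the stream state while the FILE is still open and only then close it. The helper below is a hypothetical illustration of that ordering, not the code this commit touched.

```cpp
#include <cstddef>
#include <cstdio>

// Read up to cap-1 bytes from `path` into buf (NUL-terminated) and report
// through `hit_eof` whether end-of-file was reached.
// Precondition: cap >= 1. Returns false if the file cannot be opened.
bool sketch_read_all(const char* path, char* buf, std::size_t cap, bool& hit_eof) {
    std::FILE* fp = std::fopen(path, "r");
    if (!fp) {
        return false;
    }

    const std::size_t n = std::fread(buf, 1, cap - 1, fp);
    buf[n] = '\0';

    // Sample the EOF indicator *before* fclose(): after the stream is
    // closed, passing fp to feof() is undefined behaviour.
    hit_eof = std::feof(fp) != 0;

    std::fclose(fp);
    return true;
}
```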
-
Avi Kivity authored
We iterate over the timer list using an iterator, but the timer list can change during iteration due to timers being re-inserted. Switch to just looking at the head of the list instead, maintaining no state across loop iterations. Signed-off-by:
Avi Kivity <avi@cloudius-systems.com> Tested-by:
Pekka Enberg <penberg@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
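The "look only at the head of the list" approach can be sketched with a std::multimap standing in for the timer list. The type sketch_timer_list and its members are hypothetical, not OSv's timer code. No iterator is kept across a callback, so a callback that re-inserts a timer cannot invalidate the loop's state.

```cpp
#include <functional>
#include <map>
#include <utility>

struct sketch_timer_list {
    using callback = std::function<void(sketch_timer_list&, long)>;

    // Expiry time -> callback; the multimap keeps the earliest timer first.
    std::multimap<long, callback> timers;

    void arm(long when, callback cb) {
        timers.emplace(when, std::move(cb));
    }

    // Fire every timer due at `now` and return how many fired.
    // Each pass re-reads the head of the list instead of advancing a
    // stored iterator, so the list may change freely inside callbacks.
    // Note: a callback that re-arms with when <= now fires again in
    // the same call.
    int fire(long now) {
        int fired = 0;
        while (!timers.empty() && timers.begin()->first <= now) {
            auto head = timers.begin();
            callback cb = std::move(head->second);
            timers.erase(head);       // detach before running the callback
            cb(*this, now);           // may call arm() on this list
            ++fired;
        }
        return fired;
    }
};
```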
-
Avi Kivity authored
When a hardware timer fires, we walk over the timer list, expiring timers and erasing them from the list. This is all well and good, except that a timer may rearm itself in its callback (this only holds for timer_base clients, not sched::timer, which consumes its own callback). If it does, we end up erasing it even though it wants to be triggered. Fix by checking for the armed state before erasing. Signed-off-by:
Avi Kivity <avi@cloudius-systems.com> Tested-by:
Pekka Enberg <penberg@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Nadav Har'El authored
When a condvar's timeout and wakeup race, we wait for the concurrent wakeup to complete, so it won't crash. We did this wr.wait() with the condvar's internal mutex (m) locked, which was fine when this code was written. But now that we have wait morphing, wr.wait() waits not just for the wakeup to complete, but also for the user_mutex to become available. With m locked and us waiting for user_mutex, we're now in deadlock territory - because a common idiom of using a condvar is to do the locks in opposite order: lock user_mutex first and then use the condvar, which locks m. I can't think of an easy way to actually demonstrate this deadlock, short of having a locked condvar_wait timeout racing with condvar_wake_one and then an additional locked condvar operation coming in concurrently, so I don't have a test case demonstrating this. I am hoping it will fix the lockups that Pekka is seeing in his Cassandra tests (which are the reason I looked for possible condvar deadlocks in the first place). Signed-off-by:
Nadav Har'El <nyh@cloudius-systems.com> Tested-by:
Pekka Enberg <penberg@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Glauber Costa authored
The problem with sleep is that we can initialize early threads before the cpu itself is initialized. If we note what goes on in init_on_cpu, it should become clear:

    void cpu::init_on_cpu()
    {
        arch.init_on_cpu();
        clock_event->setup_on_cpu();
    }

When we finally initialize the clock_event, it can get lost if we already have pending timers of any kind - which we may, if we have early threads being start()ed before that. I have played with many potential solutions, but in the end, I think the most sensible thing to do is to delay initialization of early threads to the point when we are first idle. That is the best way to guarantee that everything will be properly initialized and running.
Signed-off-by:
Glauber Costa <glommer@cloudius-systems.com> Signed-off-by:
Avi Kivity <avi@cloudius-systems.com>
-
- Nov 22, 2013
-
-
ufokaradagli@gmail.com authored
Fixed a couple of spelling mistakes in README.md Signed-off-by:
Omer Karadagli <ufokaradagli@gmail.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Avi Kivity authored
To prevent leaks when a file is close()d without an EPOLL_CTL_DEL, record epoll registrations in the file structure and remove them when the file is destroyed. Signed-off-by:
Avi Kivity <avi@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
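The bookkeeping described above can be sketched with two small classes. The names sketch_file and sketch_epoll are hypothetical and the real structures are more involved. The point is that the file remembers every epoll instance it is registered with, so its destructor can perform the equivalent of the missing EPOLL_CTL_DEL.

```cpp
#include <set>

struct sketch_file;

// Hypothetical epoll instance: the set of files it currently watches.
struct sketch_epoll {
    std::set<sketch_file*> watched;
    void add(sketch_file& f);   // like EPOLL_CTL_ADD
    void del(sketch_file& f);   // like EPOLL_CTL_DEL
};

// Hypothetical file object that records its own epoll registrations.
struct sketch_file {
    std::set<sketch_epoll*> registrations;

    ~sketch_file() {
        // The file is going away: drop it from every epoll instance that
        // still references it, so none is left with a stale entry.
        // (This sketch assumes each epoll instance outlives the file.)
        for (sketch_epoll* ep : registrations) {
            ep->watched.erase(this);
        }
    }
};

inline void sketch_epoll::add(sketch_file& f) {
    watched.insert(&f);
    f.registrations.insert(this);
}

inline void sketch_epoll::del(sketch_file& f) {
    watched.erase(&f);
    f.registrations.erase(this);
}
```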
-
Avi Kivity authored
Avoid possible blocking. Signed-off-by:
Avi Kivity <avi@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Avi Kivity authored
Make sure to wait until the running thread count drops to zero before destroying things. Signed-off-by:
Avi Kivity <avi@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Avi Kivity authored
Since it's initialized with the constructor, the mutex is already initialized. Signed-off-by:
Avi Kivity <avi@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Avi Kivity authored
Use file::operator delete to ensure it is reclaimed via RCU, and let the rest of the cleanup happen via the destructor. This allows us to add other members to file, and let the standard construction/destruction sequence take place. Note the constructor is already used (falloc_noinstall()). Signed-off-by:
Avi Kivity <avi@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
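The split between destructor cleanup and deferred memory reclamation can be sketched with a class-specific operator delete. The deferred list below is only a stand-in for an RCU grace period, and sketch_file, sketch_deferred and sketch_rcu_flush() are hypothetical names. A `delete` expression first runs the destructor and then calls the class's operator delete, which here parks the memory instead of freeing it immediately.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>

// Stand-in for RCU: memory waits here until sketch_rcu_flush() decides
// that no reader can still be looking at it.
static std::vector<void*> sketch_deferred;

void sketch_rcu_flush() {
    for (void* p : sketch_deferred) {
        std::free(p);
    }
    sketch_deferred.clear();
}

struct sketch_file {
    static int live;             // number of constructed, undestroyed objects

    sketch_file() { ++live; }
    ~sketch_file() { --live; }   // ordinary member cleanup happens here

    static void* operator new(std::size_t size) {
        if (void* p = std::malloc(size)) {
            return p;
        }
        throw std::bad_alloc();
    }

    // Called by `delete` after the destructor has run. The storage is
    // queued for later release rather than freed on the spot.
    // (In this sketch an allocation failure inside push_back would terminate.)
    static void operator delete(void* p) {
        if (p) {
            sketch_deferred.push_back(p);
        }
    }
};

int sketch_file::live = 0;
```

Because the destructor still runs at `delete` time, new members with their own destructors can be added to the class without touching the reclamation path.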
-
Avi Kivity authored
Holding filerefs causes close() to be delayed indefinitely in case the user "forgets" to EPOLL_CTL_DEL the file before close(). Signed-off-by:
Avi Kivity <avi@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-