author: Stefan Fritsch <sf@cvs.openbsd.org> 2014-10-06 20:34:59 +0000
committer: Stefan Fritsch <sf@cvs.openbsd.org> 2014-10-06 20:34:59 +0000
commit: d41036be6ee3c813374da5b0b78ba5df2ff92c97
tree: be03285a14d85a20c60dd26ddc2a8d2f3e582ea4 /lib/libevent/Makefile
parent: 663dd3a733d1e1eb57165f93e9999a75de283023
Make amd64 pmap more efficient on multi-processor

With the current implementation, when accessing an inactive pmap, its ptes are mapped in the APTE range. This has the problem that the APTE range is mapped on all CPUs and changes to the APTE must therefore be followed by a remote TLB flush on all CPUs. This is very inefficient because the costs increase quadratically with the number of CPUs.

Therefore, the code is changed to remove the APTE mechanism completely and instead switch the pmap locally. A remote TLB flush is then only done if the pmap is in use on the remote CPU. In the common case, this will replace one TLB flush on all CPUs with two local TLB flushes.

An additional optimization is done in cases where only a single PTE of an inactive pmap is accessed: The requested PTE is found by walking the page tables manually via the direct mapping. This makes some more TLB flushes unnecessary.

Furthermore, some code is reordered so that the TLB-shootdown-IPIs are sent first, then more local processing takes place, and only afterwards the CPU waits for the remote TLB-shootdowns to finish.

This diff is based on a patch for i386 by Artur Grabowski <art blahonga org> from 2008. Some additional bits were taken from a different patch by Artur from 2005.

Tested by many. OK mlarkin@
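
The single-PTE lookup described above, walking the page tables by hand through the direct mapping, can be illustrated with a small user-space sketch. This is not the code from the commit: physical memory is simulated with a static array, and `direct_map`, `alloc_table`, `map_page` and `walk_pte` are hypothetical helpers invented for this example. The layout assumed is the usual amd64 one (four levels, 512 entries per table, 4 KB pages); large pages are ignored, and the `PG_V` and `PG_FRAME` values are chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define PG_V        0x1ULL                  /* entry is valid (assumed value) */
#define PG_FRAME    0x000ffffffffff000ULL   /* frame address bits (assumed value) */
#define PAGE_SHIFT  12
#define PAGE_SIZE   (1ULL << PAGE_SHIFT)
#define LEVELS      4                       /* four-level paging */
#define NPAGES      8                       /* pages of simulated physical memory */

/* Simulated physical memory, zero-initialised; page 0 holds the root table. */
static uint64_t phys_mem[NPAGES * PAGE_SIZE / sizeof(uint64_t)];
static uint64_t next_free_pa = PAGE_SIZE;

/* Hypothetical direct map: turn a physical address into a usable pointer. */
uint64_t *
direct_map(uint64_t pa)
{
	return &phys_mem[pa / sizeof(uint64_t)];
}

/* Index into the table at `level` (3 = top, 0 = leaf) for address va. */
unsigned
pt_index(uint64_t va, int level)
{
	return (va >> (PAGE_SHIFT + 9 * level)) & 0x1ff;
}

/* Bump allocator handing out zeroed table pages from the simulated memory. */
uint64_t
alloc_table(void)
{
	assert(next_free_pa < NPAGES * PAGE_SIZE);
	uint64_t pa = next_free_pa;
	next_free_pa += PAGE_SIZE;
	return pa;
}

/* Install a 4 KB mapping va -> frame_pa, creating missing tables on the way. */
void
map_page(uint64_t root_pa, uint64_t va, uint64_t frame_pa)
{
	uint64_t pa = root_pa;
	for (int level = LEVELS - 1; level > 0; level--) {
		uint64_t *entry = &direct_map(pa)[pt_index(va, level)];
		if ((*entry & PG_V) == 0)
			*entry = alloc_table() | PG_V;
		pa = *entry & PG_FRAME;
	}
	direct_map(pa)[pt_index(va, 0)] = (frame_pa & PG_FRAME) | PG_V;
}

/*
 * Find the leaf PTE for va in the tables rooted at root_pa, reading every
 * level through the direct map. The tables are never mapped into a special
 * window and no address space is switched. Returns 0 and stores the entry
 * in *ptep, or -1 if an intermediate level is not valid.
 */
int
walk_pte(uint64_t root_pa, uint64_t va, uint64_t *ptep)
{
	uint64_t pa = root_pa;
	for (int level = LEVELS - 1; level > 0; level--) {
		uint64_t entry = direct_map(pa)[pt_index(va, level)];
		if ((entry & PG_V) == 0)
			return -1;
		pa = entry & PG_FRAME;
	}
	*ptep = direct_map(pa)[pt_index(va, 0)];
	return 0;
}
```

Because each level is read through an address that is already valid, the lookup itself does not change any mapping, which is why a walk of this kind needs no TLB flush.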
Diffstat (limited to 'lib/libevent/Makefile')
0 files changed, 0 insertions, 0 deletions