path: root/sys/kern
Age | Commit message | Author
2015-09-13 | Introduce sched_barrier(9), an interface that acts as a scheduler barrier in | Mark Kettenis
the sense that it guarantees that the specified CPU went through the scheduler. This also guarantees that interrupt handlers running on that CPU will have finished when sched_barrier() returns. ok miod@, guenther@
2015-09-11 | back out refcnt for dv_ref, there's too many hand crafted devices all | David Gwynne
over the tree. much encouragement from l2k15
2015-09-11 | unbreak build on UP kernels. | David Gwynne
found by deraadt@
2015-09-11 | make srp use refcnts so it can use refcnt_finalize instead of | David Gwynne
sleep_setup/sleep_finish.
2015-09-11 | use refcnts for the device reference counts as an example of how | David Gwynne
refcnt(9) can be used.
2015-09-11 | introduce a wrapper around reference counts called refcnt. | David Gwynne
it's basically atomic inc/dec, but it includes magical sleep code in refcnt_finalize that is better written once than many times. refcnt_finalize sleeps until all references are released and does so with sleep_setup and sleep_finish, which is fairly subtle. putting this in now so we can get on with work in the stack, a proper discussion about visibility and how available intrinsics should be in the kernel can happen after next week. with help from guenther@ ok guenther@ deraadt@ mpi@
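A minimal userspace sketch of the "atomic inc/dec" half of the refcnt idea described above, using C11 atomics. The `sketch_` names are hypothetical and are not the kernel's refcnt(9) interface. The sleeping side (waiting until all references are released) is deliberately left out; here the release function only reports whether the caller dropped the last reference.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for a reference count; not the kernel struct. */
struct sketch_refcnt {
	atomic_uint refs;
};

/* The creator of the object holds the first reference. */
static void
sketch_refcnt_init(struct sketch_refcnt *r)
{
	atomic_init(&r->refs, 1);
}

/* Take an extra reference; the caller must already hold one. */
static void
sketch_refcnt_take(struct sketch_refcnt *r)
{
	unsigned int old = atomic_fetch_add(&r->refs, 1);

	assert(old != 0);
}

/* Drop a reference; returns 1 if this was the last one, else 0. */
static int
sketch_refcnt_rele(struct sketch_refcnt *r)
{
	unsigned int old = atomic_fetch_sub(&r->refs, 1);

	assert(old != 0);
	return (old == 1);
}
```

In the kernel version the party that wants to destroy the object would sleep until the count drains instead of polling the return value.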
2015-09-11 | Hoist all the GPT header checks into gpt_chk_header(). Tweak remaining | Kenneth R Westerback
logic a bit so that an invalid primary header/partition entries table does not cause readgptlabel() to exit before the secondary header is tried.
2015-09-11 | Convert _TM_ flags to TAME_ flags, collapsing the entire mapping | Theo de Raadt
layer because the strings select the right options. Mechanical conversion. ok guenther
2015-09-11 | Move all prototypes of gpt helper functions to top of file. Rename | Kenneth R Westerback
get_fstype() to gpt_get_fstype() as it moves.
2015-09-11 | Shuffle some variables around, add a couple, and eliminate hordes | Kenneth R Westerback
of repeated letoh32() and letoh64() in readgptlabel() to make code more readable.
2015-09-11 | Move initialization of count of spoofed GPT partitions closer | Kenneth R Westerback
to use.
2015-09-11 | GPT partitions cannot start at offset 0. Eliminate the variable | Kenneth R Westerback
tracking our discovery of the first OpenBSD partition (ourpart) and just use the variable holding the offset of the first OpenBSD partition (gptpartoff). Move initialization of gptpartoff and gptpartend closer to their use and set them when the first OpenBSD partition is found. This eliminates a later 'if' statement.
2015-09-11 | remove some bits of srp.h i had pasted in here by accident | David Gwynne
2015-09-11 | KNF shuffling of local declarations in readgptlabel(). | Kenneth R Westerback
2015-09-11 | There must be no space after the syslog priority in the sendsyslog(2) | Alexander Bluhm
dropped message error log. OK benno@
2015-09-11 | readgptlabel() is called from readdoslabel() so there is no need | Kenneth R Westerback
for readgptlabel() to re-check that the label d_secpercyl and d_secsize are not 0.
2015-09-11 | Spoof EFI SYSTEM GPT partitions as MSDOS partitions. As is done | Kenneth R Westerback
with MBR EFI SYSTEM partitions.
2015-09-11 | Now that interrupt-safe uvm maps are properly locked, the interrupt-safe | Mark Kettenis
multi page backend allocator implementation no longer needs to grab the kernel lock. ok mlarkin@, dlg@
2015-09-11 | Eliminate use-once variable in readgptlabel() and just use the | Kenneth R Westerback
function value the variable was being set to.
2015-09-11 | Add ddb ps/o, displaying just the non-idle on-proc threads | Philip Guenther
ok deraadt@
2015-09-11 | Only include <sys/tame.h> in the .c files that need it | Philip Guenther
ok deraadt@ miod@
2015-09-11 | Don't spoof GPT OpenBSD partitions. Simply record and use the first one | Kenneth R Westerback
found, as is done in MBR processing.
2015-09-11 | Change device locators type from int to long, for the sake of 64-bit ports | Miod Vallat
without proper device trees. Be sure to build and install config(8) and rerun it before attempting to build a kernel. ok kettenis@ deraadt@ jasper@ visa@
2015-09-10 | sizes for free(); ok sthen | Theo de Raadt
2015-09-10 | Now that the GPT code tries really hard not to get in the way and | Kenneth R Westerback
accidentally capture disks ... Eliminate kernel option GPT and associated #ifdef GPT/#endif. Let everybody get on the GPT bandwagon and we'll see what wheels fly off. Requested by & ok deraadt@
2015-09-10 | Call readgptlabel() from readdoslabel() instead of MD readdisklabel(). | Kenneth R Westerback
Call it if and only if there is an MBR on sector 0 that contains 1 and only 1 partition; that partition is an EFI partition; and it covers the entire disk or as much of the disk as can be covered in an MBR partition. Be paranoid about restoring any possible tweaks to the label being built in the case that readgptlabel() fails, and in that case return to the readdoslabel() code. ok deraadt@
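The gating condition in the commit above can be sketched as a small predicate. This is an illustration only, not the code from readdoslabel(): the struct, the `sketch_` names, the use of type 0xEE for the EFI partition, and the reading of "covers the disk" as "ends at or beyond the last sector, or has the largest size a 32-bit MBR field can hold" are all assumptions.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NMBRPART		4	/* slots in an MBR partition table */
#define SKETCH_TYPE_UNUSED	0x00
#define SKETCH_TYPE_EFI		0xEE	/* assumed EFI/GPT protective type */

/* Hypothetical, simplified MBR partition entry. */
struct sketch_mbr_part {
	uint8_t		type;
	uint32_t	start;	/* first sector */
	uint32_t	size;	/* length in sectors */
};

/* Returns 1 if the GPT should be tried, 0 otherwise. */
static int
sketch_should_try_gpt(const struct sketch_mbr_part p[SKETCH_NMBRPART],
    uint64_t disksize)
{
	const struct sketch_mbr_part *only = NULL;
	int i, used = 0;

	for (i = 0; i < SKETCH_NMBRPART; i++) {
		if (p[i].type != SKETCH_TYPE_UNUSED) {
			used++;
			only = &p[i];
		}
	}
	/* Exactly one partition, and it must be the EFI one. */
	if (used != 1 || only->type != SKETCH_TYPE_EFI)
		return (0);
	/* As much of the disk as an MBR partition can describe. */
	if (only->size == UINT32_MAX)
		return (1);
	/* Otherwise it has to reach the end of the disk. */
	return ((uint64_t)only->start + only->size >= disksize);
}
```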
2015-09-10 | Don't stop spoofing GPT partitions when the OpenBSD partition is | Kenneth R Westerback
found. Keep going until we spoof 8 or run out of partitions needing spoofing.
2015-09-09 | No need to set d_npartitions in readdoslabel() or readgptlabel(). | Kenneth R Westerback
It has already been initialized in the MD readdisklabel() routines when they call initdisklabel(). ok deraadt@
2015-09-09 | sync | Theo de Raadt
2015-09-09 | Move to next tame() API. The flags are now passed as a very simple string, | Theo de Raadt
which results in tame() code placements being much more recognizable. tame() can be moved to unistd.h and does not need cpp symbols to turn the bits on and off. The resulting API is a bit unexpected, but simplifies the mapping to enabling bits in the kernel substantially. vague ok's from various including guenther doug semarie
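To illustrate how a flags string can map straight onto enable bits, as the commit above describes, here is a small sketch. The request words, bit values, and `sketch_` names are invented for the example and are not taken from the kernel's tame code; the word separator is assumed to be a single space character.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical request table: word -> bit. */
static const struct {
	const char	*name;
	unsigned int	 bit;
} sketch_reqs[] = {
	{ "stdio", 1u << 0 },
	{ "rpath", 1u << 1 },
	{ "wpath", 1u << 2 },
};

/*
 * Parse a space-separated request string into a bitmask.
 * Returns 0 and stores the mask in *flags, or -1 on an unknown word.
 */
static int
sketch_parse_requests(const char *s, unsigned int *flags)
{
	const size_t n = sizeof(sketch_reqs) / sizeof(sketch_reqs[0]);
	unsigned int f = 0;

	while (*s != '\0') {
		size_t i, len = strcspn(s, " ");

		if (len > 0) {
			for (i = 0; i < n; i++) {
				if (strlen(sketch_reqs[i].name) == len &&
				    strncmp(s, sketch_reqs[i].name, len) == 0)
					break;
			}
			if (i == n)
				return (-1);
			f |= sketch_reqs[i].bit;
		}
		s += len;
		if (*s == ' ')
			s++;
	}
	*flags = f;
	return (0);
}
```

With a table like this, adding a request means adding one row, with no cpp symbols exposed to callers.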
2015-09-09 | implement a singly linked list built with SRPs. | David Gwynne
this allows us to build lists of things that can be followed by multiple cpus. ok mpi@ claudio@
2015-09-08 | Give the pool page allocator backends more sensible names. We now have: | Mark Kettenis
* pool_allocator_single: single page allocator, always interrupt safe
* pool_allocator_multi: multi-page allocator, interrupt safe
* pool_allocator_multi_ni: multi-page allocator, not interrupt-safe
ok deraadt@, dlg@
2015-09-08 | Now that msleep(9) no longer requires the kernel lock (as long as PCATCH | Mark Kettenis
isn't specified) the default backend allocator implementation no longer needs to grab the kernel lock. ok visa@, guenther@
2015-09-07 | Delete ktracing of context switches: it's unused, and not particularly useful, | Philip Guenther
and doing VOP_WRITE() from inside tsleep/msleep makes the locking too complicated, making it harder to move forward on MP changes. ok deraadt@ kettenis@
2015-09-06 | We no longer need to grab the kernel lock for allocating and freeing pages | Mark Kettenis
in the (default) single page pool backend allocator. This means it is now safe to call pool_get(9) and pool_put(9) for "small" items while holding a mutex without holding the kernel lock as well, as these functions will no longer acquire the kernel lock under any circumstances. For "large" items (where large is larger than 1/8th of a page) this still isn't safe though.
ok dlg@
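The "large" threshold mentioned in the commit above can be written down as a one-line check. This is only a sketch of the stated rule; the function name is hypothetical and the real pool code may compute the boundary differently.

```c
#include <stddef.h>

/*
 * Returns 1 if an item counts as "large" under the rule stated in the
 * commit message (larger than 1/8th of a page), else 0.
 */
static int
sketch_pool_item_is_large(size_t item_size, size_t page_size)
{
	return (item_size > page_size / 8);
}
```

For a 4096-byte page the boundary falls at 512 bytes: a 512-byte item is still "small", a 513-byte item is "large".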
2015-09-04 | Make every subsystem using a radix tree call rn_init() and pass the | Martin Pieuchot
length of the key as argument. This way every consumer of the radix tree has a chance to explicitly initialize the shared data structures and no longer rely on another subsystem to do the initialization. As a bonus ``dom_maxrtkey'' is no longer used and dies. ART kernels should now be fully usable because pf(4) and IPSEC properly initialize the radix tree. ok chris@, reyk@
2015-09-03 | Fix !INET6 build. | Martin Pieuchot
2015-09-02 | To make logging to local syslog reliable, log a message about failed | Alexander Bluhm
log attempts. sendsyslog(2) is a good place to detect and report the problem. OK deraadt@
2015-09-01 | the special check logic for /usr/share/nls/../libc.cat became failure | Theo de Raadt
to return failure. open() of these paths should succeed to satisfy strerror() and friends. ok semarie
2015-09-01 | Corrects a use-after-free in tame_namei(). | Sebastien Marie
ok doug@
2015-09-01 | Push down the KERNEL_LOCK/KERNEL_UNLOCK calls into the back-end allocator | Mark Kettenis
functions. Note that these calls are deliberately not added to the special-purpose back-end allocators in the various pmaps. Those allocators either don't need to grab the kernel lock, are always called with the kernel lock already held, or are only used on non-MULTIPROCESSOR platforms. ok tedu@, deraadt@, dlg@
2015-09-01 | mattieu baptiste reported a problem with bpf+srps where the per cpu | David Gwynne
hazard pointers were becoming corrupt, which led to panics.

the problem turned out to be that bridge_input calls if_input on behalf of a hardware interface, which then calls bpf_mtap at splsoftnet, while the actual hardware nic calls if_input and bpf_mtap at splnet. hardware interrupts could therefore run in the middle of the bpf calls that bridge makes at softnet. this means the same srps are being entered and left on the same cpu at different ipls, which led to races because of the order of operations on the per cpu hazard pointers.

after a lot of experimentation, jmatthew@ figured out how to deal with this problem without introducing per cpu critical sections (ie, splhigh calls) in srp_enter and srp_leave, and without introducing atomic operations.

the solution is to iterate forward through the array of hazard pointers in srp_enter, and backward in srp_leave to clear. if you guarantee that you leave srps in the reverse order to entering them, then you can use the same set of SRPs at different IPLs on the same CPU.

the ordering requirement is a problem if we want to build linked data structures out of srps because you need to hold a ref to the current element containing the next srp to use it, before giving up the current ref. we're adding srp_follow() to support taking the next ref and giving up the current one while preserving the structure of the hazard pointer list. srp_follow() does this by reusing the hazard pointer for the current reference for the next ref.

both mattieu baptiste and jmatthew@ have been hitting this pretty hard with a tweaked version of srp+bpf that uses srp_follow instead of interleaved srp_enter/srp_leave sequences. neither can reproduce the panics anymore.

thanks to mattieu for the report and tests
ok jmatthew@
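The scan order described in the commit above (forward to claim a slot on enter, backward to clear it on leave, slot reuse on follow) can be sketched with a plain array. This is a single-threaded illustration with hypothetical `sketch_` names and an arbitrary slot count; it leaves out the real srp code's re-validation of the published pointer and all memory-ordering concerns.

```c
#include <stddef.h>

#define SKETCH_NHAZARDS	4	/* arbitrary size for the example */

/* Hypothetical per-CPU hazard pointer array. */
struct sketch_cpu {
	void	*hazard[SKETCH_NHAZARDS];
};

/* Enter: scan forward, claim the first free slot. Returns slot or -1. */
static int
sketch_enter(struct sketch_cpu *c, void *v)
{
	int i;

	for (i = 0; i < SKETCH_NHAZARDS; i++) {
		if (c->hazard[i] == NULL) {
			c->hazard[i] = v;
			return (i);
		}
	}
	return (-1);
}

/* Leave: scan backward, clear the last slot holding v. */
static int
sketch_leave(struct sketch_cpu *c, void *v)
{
	int i;

	for (i = SKETCH_NHAZARDS - 1; i >= 0; i--) {
		if (c->hazard[i] == v) {
			c->hazard[i] = NULL;
			return (i);
		}
	}
	return (-1);
}

/* Follow: reuse the slot of the current ref for the next ref. */
static int
sketch_follow(struct sketch_cpu *c, void *cur, void *next)
{
	int i;

	for (i = SKETCH_NHAZARDS - 1; i >= 0; i--) {
		if (c->hazard[i] == cur) {
			c->hazard[i] = next;
			return (i);
		}
	}
	return (-1);
}
```

Because follow overwrites the slot in place, walking a list never changes which slots are occupied, so the reverse-order leave requirement still holds at the end of the walk.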
2015-09-01 | a white space krw could not see | Theo de Raadt
2015-09-01 | 'bogous' is bogus spelling of 'bogus' in debug message. | Kenneth R Westerback
2015-09-01 | Missing letoh64() when checking value of gh_lba_alt. | Kenneth R Westerback
2015-08-31 | Consider getfsstat() a RPATH, even though it has no path in it. We may | Theo de Raadt
want to do the same for fstatfs(), after we handle statfs(). These system calls leak path information, however I am reluctant to add a separate category.
2015-08-31 | In tame mode, return EPERM for *chown if uid/gid change is not towards | Theo de Raadt
cr_uid/cr_gid (effective ids). Thus, chown(, -1,-1) should work OK, so should chown(, me, -1), etc. With this committed, more people can test.
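The chown rule above can be expressed as a small check. This is a sketch of the stated policy only, with a hypothetical function name; it treats -1 as "leave unchanged", as chown(2) does, and is not the kernel's implementation.

```c
#include <errno.h>
#include <sys/types.h>

/*
 * Returns 0 if the requested change is allowed under the rule in the
 * commit message, EPERM otherwise. cr_uid/cr_gid are the effective ids.
 */
static int
sketch_tame_chown_check(uid_t uid, gid_t gid, uid_t cr_uid, gid_t cr_gid)
{
	/* -1 means "do not change"; anything else must match our own id. */
	if (uid != (uid_t)-1 && uid != cr_uid)
		return (EPERM);
	if (gid != (gid_t)-1 && gid != cr_gid)
		return (EPERM);
	return (0);
}
```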
2015-08-31 | Rather than killing when *chmod is asked to do setuid/setgid, clear | Theo de Raadt
those bits in the request and continue. This is a better posix-subset to give to programs.
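The "clear those bits and continue" behaviour described above amounts to masking the requested mode. A minimal sketch, with a hypothetical function name and using the standard S_ISUID and S_ISGID constants from <sys/stat.h>:

```c
#include <sys/types.h>
#include <sys/stat.h>

/* Strip setuid/setgid from a requested mode instead of refusing it. */
static mode_t
sketch_tame_chmod_mode(mode_t mode)
{
	return (mode & ~(mode_t)(S_ISUID | S_ISGID));
}
```

All other permission bits pass through unchanged, so a request for 04755 proceeds as 0755.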
2015-08-31 | Abstract 5 identical code blocks into a readdisksector() function. | Kenneth R Westerback
Cleaner, clearer and less error prone. Tested by bmercer@ as part of a larger diff, of which this is the last part. reads ok to jsing@ kettenis@. ok deraadt@.
2015-08-31 | Rejig the expression calculating the address of the disk | Kenneth R Westerback
sector containing the disklabel, eliminating an unnecessary " * DL_BLKSPERSEC()". Tested by bmercer@ as part of larger diff. Idea from & reads ok to jsing@. ok kettenis@.