path: root/sys/kern
Age | Commit message | Author
2021-10-04 | Simplify sys___thrsigdivert a bit. | Claudio Jeker
cursig() always moves the pending signal to p_siglist and so there is no need to check ps_siglist for the signal. OK mpi@
2021-10-04 | Use the fact that the vnodes are locked when operations are inflight. | Claudio Jeker
Remove the v_inflight member and alter the ffs and ext2fs sync code to track inflight by checking if the node is locked or not (which it already did before but for a different reason). OK mpi@
2021-10-02 | remove dead variable from sys___realpath() | Sebastien Marie
it is a leftover from LOCKPARENT removal in NDINIT() (in rev 1.337) ok mpi@
2021-10-02 | vfs: merge *_badop to vop_generic_badop | Sebastien Marie
It replaces spec_badop, fifo_badop, dead_badop and mfs_badop, which are only calls to panic(9), with a single function, vop_generic_badop(). No intended behaviour changes (apart from the panic message, which isn't the same). ok mpi@
2021-09-28 | Fix timeout behaviour bug introduced in 1.241. | Claudio Jeker
If the timespec is zero-valued, sys___thrsigdivert() should just check for pending signals and return immediately. OK kettenis@
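The zero-timeout rule described in this commit can be illustrated with a small userland sketch. It is not the kernel code: the enum, `ts_is_zero()` and `choose_wait()` are hypothetical names invented here, and treating a NULL timeout as "wait indefinitely" is an assumption of the sketch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical outcome codes for this sketch; not kernel constants. */
enum wait_action { WAIT_POLL_ONLY, WAIT_WITH_TIMEOUT, WAIT_FOREVER };

/* True when both fields of the timespec are zero. */
static bool
ts_is_zero(const struct timespec *ts)
{
	return ts->tv_sec == 0 && ts->tv_nsec == 0;
}

/*
 * Decide how to wait. Assumption: a NULL timeout blocks indefinitely.
 * A zero-valued timeout only polls for pending signals and returns;
 * any other value sleeps with a deadline.
 */
static enum wait_action
choose_wait(const struct timespec *timeout)
{
	if (timeout == NULL)
		return WAIT_FOREVER;
	if (ts_is_zero(timeout))
		return WAIT_POLL_ONLY;
	return WAIT_WITH_TIMEOUT;
}
```

A caller would check for pending signals first in every case and go to sleep only when the result is not `WAIT_POLL_ONLY`.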
2021-09-09 | Add THREAD_PID_OFFSET to tracepoint arguments that pass a TID to userland. | Martin Pieuchot
Bring these values in sync with the `tid' builtin which already includes the offset. This is necessary to build scripts comparing them, like:

  tracepoint:sched:enqueue { @ts[arg0] = nsecs; }
  tracepoint:sched:on__cpu /@ts[tid]/ { latency = nsecs - @ts[tid]; }

Discussed with and ok bluhm@
2021-09-09 | Move a check to avoid panicking on contended rwlock(9) outside of DIAGNOSTIC. | Martin Pieuchot
ok kettenis@
2021-09-09 | No need to initialize nuv, it is assigned to before use. | Claudio Jeker
2021-09-05 | Introduce dummy pagers for 'special' subsystems using UVM objects. | Martin Pieuchot
Some pmaps (x86, hppa) and the buffer cache rely on UVM objects to allocate and manipulate pages. These objects should not be manipulated by uvm_fault() and do not currently require the same locking enforcement. Use the dummy pagers to explicitly document which UVM functions are meant to manipulate UVM objects (uobj) that do not need the upcoming `vmobjlock' and instead still rely on the KERNEL_LOCK(). Tested by many as part of a larger diff. ok kettenis@, beck@
2021-09-03 | add kprobes provider for dt | Jasper Lievisse Adriaanse
this allows us to dynamically trace function boundaries with btrace by patching prologues and epilogues with a breakpoint, upon which the handler records the data and sends it back to userland for btrace to consume. currently it's hidden behind DDBPROF, and there is still a lot to clean up and improve, but basic scripts that observe return codes from a probed function work. from Tom Rollet, with various changes by me. feedback and ok mpi@
2021-09-02 | Refactor how unveil generates EACCES errors. | Claudio Jeker
Instead of tracking the possible violation during the traversal of the path, do the check at the end. This makes the code a bit easier to grok. OK beck@ semarie@
2021-08-31 | Honour netinet6 when generating symlinks to tags files | Klemens Nanni
"make tags" needs "make links" to have tags available in subdirectories and netinet6 has been missing all the time. OK tb
2021-08-31 | Swap lock flags so that LK_EXCLUSIVE is first like in all other places. | Claudio Jeker
2021-08-31 | printing the hibernate image size in MB is easier on the eyes | Theo de Raadt
ok mlarkin
2021-08-30 | increase hibernate writeout speed a little. | Theo de Raadt
modern machines have vast tracts of unused memory, and the empty-space RLE scanner (uvm_page_rle) would rescan for empty space needlessly, wasting excessive cpu time. 16G machine: 100sec -> 9sec; 40G machine: 325sec -> 28sec. with kettenis mlarkin
2021-08-30 | Make sure unveil remains locked over fork even in the case where the parent just called unveil(NULL, NULL) and nothing else. | Claudio Jeker
With and OK beck@
2021-08-02 | Don't call cpu_setperf() when reading hw.setperf. | Theo Buehler
"makes perfect sense to me" chris ok gnezdo jca
2021-07-26 | Pass a socket pointer to various socket buffer routines in preparation for per-socket locking. | Martin Pieuchot
No functional change.
2021-07-25 | Kill unused sbinsertoob(). | Martin Pieuchot
ok mvs@
2021-07-24 | Modifying a knote must be done with the corresponding lock held. | Martin Pieuchot
Assert that the KERNEL_LOCK() is held unless the filter is marked as MPSAFE. Should help finding missing locks when unlocking various filters. ok visa@
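The assertion pattern this commit describes can be sketched in userland C. Every name below (`FILTEROP_MPSAFE_SKETCH`, `struct filter_sketch`, `struct knote_sketch`, `kernel_lock_held`, `knote_modify_sketch()`) is hypothetical and only stands in for the kernel's own structures and lock-ownership test.

```c
#include <assert.h>
#include <stdbool.h>

#define FILTEROP_MPSAFE_SKETCH 0x1	/* hypothetical "filter is MP-safe" flag */

struct filter_sketch {
	int flags;
};

struct knote_sketch {
	const struct filter_sketch *filter;
	int data;
};

/* Stand-in for a real "does this thread hold the kernel lock?" test. */
static bool kernel_lock_held;

/*
 * Modify a knote. Filters not marked MP-safe must be modified with the
 * big lock held, so assert that; MP-safe filters rely on their own lock.
 */
static void
knote_modify_sketch(struct knote_sketch *kn, int data)
{
	if ((kn->filter->flags & FILTEROP_MPSAFE_SKETCH) == 0)
		assert(kernel_lock_held);
	kn->data = data;
}
```

With such an assertion in place, a code path that drops the big lock too early for a filter that is not MP-safe fails loudly instead of corrupting state silently.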
2021-07-22 | Make kqpoll_dequeue() usable with lazy removal of knotes | Visa Hankala
Adjust kqpoll_dequeue() so that it will clear only badfd knotes when called from kqpoll_init(). This is needed by kqpoll's lazy removal of knotes. Eager removal in kqpoll_dequeue() would defeat kqpoll's attempt to reuse previously established knotes under workloads where knote activation tends to occur already before next kqpoll scan. Prompted by mpi@
2021-07-16 | Remove the unveil current directory pointer from struct process. | Claudio Jeker
Instead pass in the vnode to unveil_start_relative() like it is done for *at() syscalls. This fixes an issue with fchdir() that actually did not correctly reset this pointer when changing the working directory. OK beck@
2021-07-15 | UNVEIL_INSPECT is no longer needed, adjust code accordingly. | Claudio Jeker
OK semarie@
2021-07-14 | After VFS shutdown, init(8) cannot map a missing page that contains the signal handler code. | Alexander Bluhm
Traditionally a process would spin in such a case, but we changed the logic in revision 1.167 trapsignal() to receive a fatal signal. If that happens to init(8), the kernel panics. In case of reboot, jump between init signal handler and page fault trap until the kernel resets the machine. reported and tested weerd@; OK deraadt@
2021-07-08 | whitespace fixes, no code change. | Mike Larkin
2021-07-08 | Remove the code to store intermediary vnodes in the unveil list. | Claudio Jeker
These traversed vnodes are a leftover from early times where realpath(3) was still all done in userland. OK semarie@
2021-07-06 | Introduce CPU_IS_RUNNING() and use it in scheduler-related code to prevent waiting on CPUs that didn't spin up. | Mark Kettenis
This will allow us to spin down CPUs in the future to save power as well. ok mpi@
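A minimal sketch of the idea, assuming a per-CPU flags word with a "running" bit. The flag value, `struct cpu_sketch` and both helper names are hypothetical; the kernel's actual macro and data structures may differ.

```c
#include <stddef.h>

#define CPUF_RUNNING_SKETCH 0x1	/* hypothetical "CPU has spun up" bit */

struct cpu_sketch {
	unsigned int flags;
	int id;
};

/* True when the CPU finished spinning up and can take part in scheduling. */
#define CPU_IS_RUNNING_SKETCH(ci) \
	(((ci)->flags & CPUF_RUNNING_SKETCH) != 0)

/*
 * Count the CPUs a scheduler-side loop would wait on, skipping any CPU
 * that never came up so the loop cannot wait on it forever.
 */
static size_t
count_waitable(const struct cpu_sketch *cpus, size_t n)
{
	size_t i, waitable = 0;

	for (i = 0; i < n; i++) {
		if (!CPU_IS_RUNNING_SKETCH(&cpus[i]))
			continue;
		waitable++;
	}
	return waitable;
}
```

The same predicate would let a loop ignore a CPU that was deliberately spun down, which is the power-saving use the commit mentions.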
2021-07-03 | __realpath: removes LOCKLEAF from NDINIT. | Sebastien Marie
The code doesn't need it: the returned vnode is released immediately. The string path is built from the namei() call using REALPATH, during directories traversal. Without LOCKLEAF, calling vrele() only is enough if namei() found a file, instead of calling VOP_UNLOCK() + vrele(). ok claudio@ mpi@
2021-07-02 | Writing ktrace files to NFS must not be done while holding the net lock. | Alexander Bluhm
accept(2) panics, connect(2) deadlocks. Additionally copy in or out must not hold the net lock as it may be a memory mapped file on NFS. Simplify dns_portcheck(), it does not modify namelen anymore. In doaccept() release the socket lock before calling copyaddrout(). Rearrange the checks in sys_connect() like they are in sys_bind(). OK mpi@
2021-06-30 | Remove unused variable cryptodesc_pool. | Alexander Bluhm
Document global variables in crypto.c and annotate locking protection. Assert kernel lock where needed. Remove dead code from crypto_get_driverid(). Move crypto_init() prototype into header file. OK mpi@
2021-06-29 | Didn't intend to commit the CPU_IS_RUNNING() changes just yet, so revert those bits. | Mark Kettenis
2021-06-29 | SMP support. Mostly works, but occasionally craps out during boot. | Mark Kettenis
ok drahn@
2021-06-29 | Adjust unveil_find_cover() to return -1 if the root vnode is passed in. | Claudio Jeker
This helps unveil_add_vnode() to properly re-evaluate unveils when "/" is added to the list. Because of this adjust unveil_covered() to check for the root as well so that in that case the unveil uv is returned instead of NULL. Traversing up from the root returns the root. This check is not really needed since namei has its own root check and shortcuts for root vnodes. OK semarie@
2021-06-29 | remove arch ifdefs around drm.h include | Jonathan Gray
ok deraadt@ kettenis@
2021-06-26 | Add powerpc64 and riscv64 to the list of architectures that have DRM. | Mark Kettenis
ok matthieu@, deraadt@, jsg@
2021-06-24 | unveil: cleanup code. no intended functional change. | Sebastien Marie
return early for simple conditions instead of navigating inside if-branches. with and ok claudio@
2021-06-23 | In unveil_add_vnode() refactor code around the indexes i and j. | Claudio Jeker
In one place the wrong index is used, resulting in re-evaluating all unveil nodes. Also loop over all but the last (just added vnode) -- again there is no need to re-evaluate the cover of the just added unveil. OK anton@ semarie@
2021-06-19 | timecounting: add FRAC_TO_NSEC(), BINTIME_TO_NSEC() | cheloha
Refactor the fraction-to-nanosecond conversion from BINTIME_TO_TIMESPEC() into a dedicated routine, FRAC_TO_NSEC(), so we can reuse it elsewhere. Then add a new BINTIME_TO_NSEC() function to sys/time.h to deduplicate conversion code in nsecuptime(), getnsecuptime(), and tc_setclock(). Thread: https://marc.info/?l=openbsd-tech&m=162376993926751&w=2 ok dlg@
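The conversion can be sketched as follows, assuming a bintime-style value holding whole seconds plus a 64-bit binary fraction of a second. `struct bintime_sketch`, `frac_to_nsec()` and `bintime_to_nsec()` are hypothetical stand-ins written for this illustration, not the definitions in sys/time.h.

```c
#include <stdint.h>

/* Hypothetical stand-in for a bintime: seconds plus a 1/2^64 fraction. */
struct bintime_sketch {
	int64_t sec;
	uint64_t frac;
};

/*
 * Convert a 64-bit binary fraction of a second to nanoseconds.
 * Only the upper 32 bits of the fraction are used, so the multiply by
 * 10^9 cannot overflow 64 bits; the final shift divides by 2^32.
 */
static inline uint64_t
frac_to_nsec(uint64_t frac)
{
	return ((frac >> 32) * 1000000000ULL) >> 32;
}

/* Whole value in nanoseconds: seconds scaled up plus the converted fraction. */
static inline uint64_t
bintime_to_nsec(const struct bintime_sketch *bt)
{
	return (uint64_t)bt->sec * 1000000000ULL + frac_to_nsec(bt->frac);
}
```

For example, a fraction of `1ULL << 63` represents half a second and converts to 500000000 ns. Dropping the low 32 bits of the fraction loses less than one nanosecond of precision.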
2021-06-19 | timeout(9): change argument order for timeout_set_kclock() | cheloha
Move the kclock argument before the flags argument. XORing a bunch of flags together may "sprawl", and I'd rather have any sprawl at the end of the parameter list. timeout_set_kclock() is undocumented and there is only one caller, so no big refactor required. Best to do this argument order shuffle before any bigger refactors of e.g. timeout_set(9).
2021-06-18 | setitimer(2): increase timer limit to UINT_MAX seconds | cheloha
Currently setitimer(2) rejects timers larger than 100 million seconds and sets EINVAL. With the change to kclock timeouts there is no longer any reason to use this arbitrary value. Kclock timeouts support the full range of a timespec, so we can increase the upper bound without practical risk of arithmetic overflow. If we push the limit to UINT_MAX we can support the full input range of alarm(3). We can then simplify the alarm.3 manpage in a separate patch. We can push the limit even higher in the future if we find software that doesn't like the UINT_MAX limit. Until then, UINT_MAX seconds (over 68 years) is plenty for all practical timers. ok claudio@
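The new upper bound can be pictured with a small validation sketch. `itimer_check_sketch()` is a hypothetical helper, and the microsecond range check is an assumption about what sensible input validation looks like, not a copy of the kernel's itimerfix().

```c
#include <errno.h>
#include <limits.h>
#include <sys/time.h>

/*
 * Validate a timer value: reject negative seconds, seconds above
 * UINT_MAX (the limit named in the commit), and out-of-range
 * microseconds. Returns 0 on success or EINVAL.
 */
static int
itimer_check_sketch(const struct timeval *tv)
{
	if (tv->tv_sec < 0 || (unsigned long long)tv->tv_sec > UINT_MAX)
		return EINVAL;
	if (tv->tv_usec < 0 || tv->tv_usec >= 1000000)
		return EINVAL;
	return 0;
}
```

Under this check a timer of 100000001 seconds, which the old 100 million second limit rejected, is accepted.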
2021-06-16 | kqueue: kq_lock is needed when updating kn_status | Visa Hankala
The kn_status field of struct knote is part of kqueue's internal state. When kn_status is being updated, kq_lock has to be locked. This is true even with MP-unsafe event filters. OK mpi@
2021-06-16 | Change the prefix of UVM object functions to match NetBSD's. | Martin Pieuchot
For example uvm_objinit() becomes uvm_obj_init(). Reduce differences between the trees and help porting new functions needed for UVM object locking. No functional change.
2021-06-15 | Remove the uvshrink logic and keep the unveil list in the order of insertion. | Claudio Jeker
unveil_lookup() is now doing a dumb linear search. The problem with the uvshrink logic was that ps_uvpcwd was a pointer into this array and after compaction it pointed to the wrong element. Also future unveil caches would suffer from the same issue. OK semarie@
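A linear lookup over an insertion-ordered array might look like the sketch below. `struct unveil_sketch` and `unveil_lookup_sketch()` are hypothetical simplifications; the real unveil entries carry more state than a vnode pointer and a flags word.

```c
#include <stddef.h>

/* Hypothetical, simplified unveil entry: a vnode identity plus flags. */
struct unveil_sketch {
	const void *vnode;
	unsigned int flags;
};

/*
 * Linear search over the list in insertion order. Entries are never
 * moved or compacted, so a pointer or index taken earlier keeps
 * referring to the same element for the lifetime of the list.
 */
static struct unveil_sketch *
unveil_lookup_sketch(struct unveil_sketch *list, size_t count,
    const void *vp)
{
	size_t i;

	for (i = 0; i < count; i++) {
		if (list[i].vnode == vp)
			return &list[i];
	}
	return NULL;
}
```

The trade-off is an O(n) lookup in exchange for stable element positions, which is what keeps long-lived references into the list valid.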
2021-06-15 | factor out nsecuptime and getnsecuptime. | David Gwynne
these functions were implemented in a bunch of places with comments saying it should be moved to kern_tc.c when more pop up, and i was about to add another one. i think it's time to move them to kern_tc.c. ok cheloa@ jmatthew@
2021-06-13 | Back off a couple of the more paranoid checks while spoofing GPT partitions into the disklabel. | Kenneth R Westerback
First, since the alt header is never accessed there is no need to worry about it being inaccessible. Second, the GPT header claiming to cover more sectors than the device has is no reason to ignore all the partitions. The partition actually present could still be useful. Issues encountered in the wild by mlarkin@ while accessing some disk images. ok deraadt@
2021-06-11 | setitimer(2): don't round up it_value | cheloha
We can reduce latency for the first expiration of a timer if we don't round it_value up to the minimum interval (1 tick). While we're at it, we may as well consolidate all input validation and adjustment into a single itimerfix() call. There are no other callers in the kernel (nor should there be), so remove the prototype from sys/time.h. Discussion: https://marc.info/?l=openbsd-tech&m=162084338005502&w=2 Tested by weerd@ and claudio@. probably ok claudio@
2021-06-11 | Remember to lock kqueue mutex in filt_timermodify(). | Visa Hankala
Reported-by: syzbot+c2aba7645a218ce03027@syzkaller.appspotmail.com
2021-06-10 | Serialize internals of kqueue with a mutex | Visa Hankala
Extend struct kqueue with a mutex and use it to serialize the internals of each kqueue instance. This should make it possible to call kqueue's system call interface without the kernel lock. The event source facing side of kqueue should now be MP-safe, too, as long as the event source itself is MP-safe. msleep() with PCATCH still requires the kernel lock. To manage with this, kqueue_scan() locks the kernel temporarily for the section that may sleep. As a consequence of the kqueue mutex, knote_acquire() can lose a wakeup when klist_invalidate() calls it. To preserve proper nesting of mutexes, knote_acquire() has to release the kqueue mutex before it unlocks klist. This early unlocking of the mutex lets badly timed wakeups go unnoticed. However, the system should not hang because the sleep has a timeout. Tested by gnezdo@ and mpi@ OK mpi@
2021-06-09 | unveil: small cleanup for UNVEIL_INSPECT | Sebastien Marie
remove two leftover checks which were used when ni_unveil was used with UNVEIL_INSPECT. it was used by:
- readlink(2) - removed 2019-08-31
- stat(2) and access(2) - removed 2019-03-24
ok claudio@
2021-06-07 | Kill SS_ASYNC and only check SB_ASYNC when async signals are wanted. | Martin Pieuchot
This socket flag was redundant with the socket buffer one. ok mvs@