path: root/sys/kern
2017-12-10  Martin Pieuchot
    Move SB_SPLICE, SB_WAIT and SB_SEL to `sb_flags', serialized by solock().
    SB_KNOTE remains the only bit set on `sb_flagsintr' as it is set/unset in contexts related to kqueue(2) where we'd like to avoid grabbing solock().
    While here add some KERNEL_LOCK()/UNLOCK() dances around selwakeup() and csignal() to mark which remaining functions need to be addressed in the socket layer.
    ok visa@, bluhm@
2017-12-09  Theo de Raadt
    More precision in pledge sysctl report
2017-12-04  Martin Pieuchot
    Change __mp_lock_held() to work with an arbitrary CPU info structure and extend ddb(4) "ps /o" output to print which CPU is currently holding the KERNEL_LOCK().
    Tested by dhill@, ok visa@
2017-12-04  Martin Pieuchot
    Use _kernel_lock_held() instead of __mp_lock_held(&kernel_lock).
    ok visa@
2017-11-28  Visa Hankala
    Raise the IPL of the sbar taskq to avoid lock order issues with the kernel lock. Fixes a deadlock seen by Hrvoje Popovski and dhill@.
    OK mpi@, dhill@
2017-11-28  Philip Guenther
    deadproc_mutex is only taken _before_ kernel_lock; exclude it from WITNESS checking as (our) witness code isn't smart enough to let that by.
    ok visa@
2017-11-28  Philip Guenther
    sync
2017-11-28  Philip Guenther
    Delete fktrace(2). The consequences of it were not thought through sufficiently and at least one horrific security hole was the result.
    ok deraadt@ beck@
2017-11-27  Philip Guenther
    Fix comment typo
2017-11-24  David Gwynne
    add timeout_barrier, which is like intr_barrier and taskq_barrier.
    if you're trying to free something that a timeout is using, you have to wait for that timeout to finish running before doing the free. timeout_del can stop a timeout from running in the future, but it doesn't know if a timeout has finished being scheduled and is now running.
    previously you could know that timeouts are not running by simply masking softclock interrupts on the cpu running the kernel. however, code is now running outside the kernel lock, and timeouts can run in a thread instead of softclock.
    timeout_barrier solves the first problem by taking the kernel lock and then masking softclock interrupts. that is enough to ensure that any further timeout processing is waiting for those resources to run again.
    the second problem is solved by having timeout_barrier insert work into the thread. when that work runs, that means all previous work running in that thread has completed.
    fixes and ok visa@, who thinks this will be useful for his work too.
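
The "insert work into the thread" idea from the timeout_barrier entry above can be sketched in userland C. This is a single-threaded simulation under stated assumptions, not the kernel code: `struct workq`, `workq_add()`, `workq_run_one()`, `workq_barrier()` and `bump()` are hypothetical names, and the barrier drains the queue itself where the kernel would sleep while the timeout thread runs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_SLOTS 16

/* One unit of deferred work: a function and its argument. */
struct work {
    void (*fn)(void *);
    void *arg;
};

/* Hypothetical FIFO standing in for the thread that runs timeouts. */
struct workq {
    struct work slots[QUEUE_SLOTS];
    size_t head;    /* index of the next item to run */
    size_t tail;    /* index of the next free slot */
};

/* Append work to the queue; returns false if the queue is full. */
static bool
workq_add(struct workq *q, void (*fn)(void *), void *arg)
{
    if (q->tail - q->head == QUEUE_SLOTS)
        return false;
    q->slots[q->tail % QUEUE_SLOTS] = (struct work){ fn, arg };
    q->tail++;
    return true;
}

/* Run the oldest pending item; returns false when nothing is pending. */
static bool
workq_run_one(struct workq *q)
{
    struct work w;

    if (q->head == q->tail)
        return false;
    w = q->slots[q->head % QUEUE_SLOTS];
    q->head++;
    w.fn(w.arg);
    return true;
}

/* The marker work: all it does is record that it ran. */
static void
barrier_mark(void *arg)
{
    *(bool *)arg = true;
}

/*
 * Queue a marker behind everything already pending, then process the
 * queue until the marker has run. Because the queue is FIFO, every item
 * queued before the barrier has completed once the marker runs.
 */
static bool
workq_barrier(struct workq *q)
{
    bool reached = false;

    if (!workq_add(q, barrier_mark, &reached))
        return false;
    while (!reached && workq_run_one(q))
        continue;
    return reached;
}

/* Sample work item used for illustration: increment a counter. */
static void
bump(void *arg)
{
    (*(int *)arg)++;
}
```

After `workq_barrier()` returns true, memory used by the earlier work items can be released without racing against them, which is the use case the commit message describes.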
2017-11-23  Martin Pieuchot
    Constify protocol tables and remove an assert now that ip_deliver() is mp-safe.
    ok bluhm@, visa@
2017-11-23  Martin Pieuchot
    We want `sb_flags' to be protected by the socket lock rather than the KERNEL_LOCK(), so change asserts accordingly. This is now possible since sblock()/sbunlock() are always called with the socket lock held.
    ok bluhm@, visa@
2017-11-17  Aaron Bieber
    permit IPV6_V6ONLY in sockopt
    OK deraadt@
2017-11-14  Theo Buehler
    Push the NET_LOCK into ifioctl() and use the NET_RLOCK in ifioctl_get(). In particular, this allows SIOCGIF* requests to run in parallel.
    lots of help & ok mpi, ok visa, sashan
2017-11-14  Visa Hankala
    Fix the initial check of the checkorder and lock operations so that statically initialized locks get properly enrolled to the validator.
    OK mpi@
2017-11-14  David Gwynne
    remove MALLOC_DEBUG
    the code has rotted, and obviously hasn't been used for ages. it is also hard to make mpsafe. if we need something like this again it would be better to do it from scratch.
    ok tedu@ visa@
2017-11-13  David Gwynne
    add taskq_barrier
    taskq_barrier guarantees that any task that was running on the taskq has finished by the time taskq_barrier returns. it is similar to intr_barrier.
    this is needed for use in ifq_barrier as part of an upcoming change.
2017-11-04  Martin Pieuchot
    raw_init() is dead and <net/raw_cb.h> doesn't need to be included there.
2017-11-04  Martin Pieuchot
    Make it possible for multiple threads to enter kqueue_scan() in parallel.
    This is a requirement to use a sleeping lock inside kqueue filters. It is now possible, but not recommended, to sleep inside ``f_event''.
    Threads iterating over the list of pending events are now recognizing and skipping other threads' markers. knote_acquire() and knote_release() must be used to "own" a knote to make sure no other thread is sleeping with a reference on it.
    Acquire and marker logic taken from DragonFly but the KERNEL_LOCK() is still serializing the execution of the kqueue code. This also enables the NET_LOCK() in socket filters.
    Tested by abieber@ & juanfra@, run by naddy@ in a bulk, ok visa@, bluhm@
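
The marker-skipping behavior described in the kqueue_scan() entry above can be illustrated with a small list walk. This is a sketch under stated assumptions, not the kernel's data structures: `struct toy_knote`, its `is_marker` field and `toy_scan()` are hypothetical, and no locking or knote_acquire()/knote_release() logic is modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical pending-list entry: either a real event or a scan marker. */
struct toy_knote {
    struct toy_knote *next;
    bool is_marker;    /* true for a marker placed by some scanning thread */
    int event;         /* payload of a real event, unused for markers */
};

/*
 * Count the real events between the list head and this thread's own end
 * marker. Markers that belong to other threads are recognized by their
 * flag and skipped instead of being treated as events.
 */
static int
toy_scan(struct toy_knote *head, const struct toy_knote *my_end)
{
    int nevents = 0;

    for (struct toy_knote *kn = head; kn != NULL && kn != my_end;
        kn = kn->next) {
        if (kn->is_marker)
            continue;    /* another thread's marker: not an event */
        nevents++;
    }
    return nevents;
}
```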
2017-11-02  Florian Obser
    Move PRU_DETACH out of pr_usrreq into per proto pr_detach functions to pave way for more fine grained locking.
    Suggested by, comments & OK mpi
2017-10-30  Visa Hankala
    Let witness(4) differentiate between taskq mutexes to avoid reporting an error in a scenario like the following:
      1. mtx_enter(&tqa->tq_mtx);
      2. IRQ
      3. mtx_enter(&tqb->tq_mtx);
    Found by Hrvoje Popovski, OK mpi@
2017-10-29  Florian Obser
    Move NET_{,UN}LOCK into individual slowtimo functions.
    Direction suggested by mpi
    OK mpi, visa
2017-10-24  Martin Pieuchot
    Use membar_enter_after_atomic(9) and membar_exit_before_atomic(9).
    Micro-optimization useful to x86 archs where the cmpxchg{q,l} instruction used by rw_enter(9) and rw_exit(9) already includes an implicit memory barrier.
    From Mateusz Guzik, ok visa@, mikeb@, kettenis@
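
A loose userland analogy for the entry above, using C11 atomics rather than the kernel's membar(9) API: the ordering requirement is attached to the atomic operation itself, so the compiler can fold it into the instruction on architectures where the compare-and-swap already orders memory. `struct toy_lock`, `toy_lock_try_enter()` and `toy_lock_exit()` are hypothetical names and this is not how rw_enter(9) is implemented.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock word: 0 means free, 1 means held. */
struct toy_lock {
    atomic_uint word;
};

/*
 * Try to take the lock with one compare-and-swap. On success the
 * operation has acquire ordering, so accesses in the critical section
 * cannot be reordered before it. No separate barrier call is needed.
 */
static bool
toy_lock_try_enter(struct toy_lock *l)
{
    unsigned int expected = 0;

    return atomic_compare_exchange_strong_explicit(&l->word, &expected, 1U,
        memory_order_acquire, memory_order_relaxed);
}

/*
 * Release the lock. Release ordering keeps the critical section's
 * accesses from being reordered after the store that frees the lock.
 */
static void
toy_lock_exit(struct toy_lock *l)
{
    atomic_store_explicit(&l->word, 0U, memory_order_release);
}
```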
2017-10-17  Visa Hankala
    Add a machine-independent implementation for the mplock.
    This reduces code duplication and makes it easier to instrument lock primitives. The MI mplock uses the ticket lock code that has been in use on amd64, i386 and sparc64. These are the architectures that now switch to the MI code.
    The lock_machdep.c files are unhooked from the build but not removed yet, in case something goes wrong.
    OK mpi@, kettenis@
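
For readers unfamiliar with the ticket lock technique named in the entry above, here is a minimal userland sketch with C11 atomics. It shows only the basic idea (take a ticket, wait until it is served) and makes no claim about the kernel's mplock, which additionally handles recursion and per-CPU state. All names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical ticket lock: waiters are served in arrival order. */
struct ticket_lock {
    atomic_uint next;       /* next ticket to hand out */
    atomic_uint serving;    /* ticket currently allowed to hold the lock */
};

/* Draw a ticket; each caller gets a unique, increasing number. */
static unsigned int
ticket_take(struct ticket_lock *l)
{
    return atomic_fetch_add_explicit(&l->next, 1U, memory_order_relaxed);
}

/* Check whether the given ticket is the one being served. */
static bool
ticket_is_mine(struct ticket_lock *l, unsigned int ticket)
{
    return atomic_load_explicit(&l->serving, memory_order_acquire) == ticket;
}

/* Acquire: take a ticket and spin until it comes up. */
static void
ticket_lock_enter(struct ticket_lock *l)
{
    unsigned int ticket = ticket_take(l);

    while (!ticket_is_mine(l, ticket))
        continue;
}

/* Release: advance the counter so the next ticket holder may proceed. */
static void
ticket_lock_exit(struct ticket_lock *l)
{
    atomic_fetch_add_explicit(&l->serving, 1U, memory_order_release);
}
```

The fairness comes from the two counters: a later arrival always holds a higher ticket and therefore cannot overtake an earlier waiter.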
2017-10-17  Martin Pieuchot
    Print the pid of the most recent program that failed to send a log via sendsyslog(2) along with the corresponding errno.
    Helps when troubleshooting which program is triggering an error, like an overflow.
    ok bluhm@
2017-10-14  Philip Guenther
    Split sys_ptrace() by request type:
    - control operations: trace_me, attach, detach, step, kill, continue. Manipulate process relation/state or send a signal
    - kernel-state get/set: thread list, event mask, trace state. About the process and don't require target to be stopped, need copyin/out
    - user-state get/set: memory, register, window cookie. Often thread-specific, require target to be stopped, need copyin/out
    sys_ptrace() changes to handle request checking, copyin/out to kernel buffers with size check and zeroing, and dispatching to the routines above for the real work. This simplifies the permission checks and copyin/out handling and will simplify lock handling in the future.
    Inspired in part by FreeBSD.
    ok mpi@ visa@
2017-10-12  Alexander Bluhm
    Print the word pledge in the kernel log when there is a violation. This should make it easier to figure out what is going on. Note that the pledgecode it shows is only a guess which pledge(2) might help.
    OK deraadt@ semarie@
2017-10-12  Martin Pieuchot
    Use a temporary variable in rw_status() to dereference only once the volatile member of the struct.
    Not forcing a memory read on every access, 3 in this function, might reduce cache traffic in some cases.
    Micro-optimization and diff provided by Mateusz Guzik.
    ok visa@
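
The pattern from the rw_status() entry above, sketched with made-up definitions: copy the volatile member into a local once, then test the copy. The flag values, `enum toy_status`, `struct toy_rwlock` and `toy_rw_status()` are hypothetical and do not reproduce OpenBSD's rwlock layout or return codes.

```c
#include <assert.h>

/* Hypothetical bit layout: low bits are flags, the rest identify the owner. */
#define TOY_WRLOCK      0x04UL
#define TOY_OWNER_MASK  (~0x07UL)

enum toy_status {
    TOY_UNLOCKED,
    TOY_READ_HELD,
    TOY_WRITE_HELD,     /* write-locked by the caller */
    TOY_WRITE_OTHER     /* write-locked by someone else */
};

struct toy_rwlock {
    volatile unsigned long owner;
};

static enum toy_status
toy_rw_status(struct toy_rwlock *rwl, unsigned long self)
{
    /*
     * Single read of the volatile member. Every use of rwl->owner would
     * force a fresh load; the local copy is used up to three times below.
     */
    unsigned long owner = rwl->owner;

    if (owner & TOY_WRLOCK) {
        if ((owner & TOY_OWNER_MASK) == self)
            return TOY_WRITE_HELD;
        return TOY_WRITE_OTHER;
    }
    if (owner != 0)
        return TOY_READ_HELD;
    return TOY_UNLOCKED;
}
```

A side effect of the single read is that all three tests see the same snapshot of the lock word, even if another CPU changes it in the meantime.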
2017-10-12  Martin Pieuchot
    Move sysctl_mq() where it can safely mess with mbuf queue internals.
    ok visa@, bluhm@, deraadt@
2017-10-11  Martin Pieuchot
    Move `kq_count' increase/decrease close to the corresponding TAILQ_* insert/remove operation.
    No functional change for the moment. However this helps to make this code mp-safe. Note that markers are still not, and won't be, counted.
    ok visa@, jsing@, bluhm@
2017-10-11  Martin Pieuchot
    Move kq_kev from struct kqueue to the stack. It turns this set of events per-thread without having to lock anything.
    From Dragonfly 10f6680a4f6684751aaae0965abfe140f19e9231
    ok kettenis@, visa@, bluhm@
2017-10-09  Martin Pieuchot
    Reduces the scope of the NET_LOCK() in sysctl(2) path.
    Exposes per-CPU counters to real parallelism.
    ok visa@, bluhm@, jca@
2017-10-09  Martin Pieuchot
    Make _kernel_lock_held() always succeed after panic(9).
    ok visa@
2017-10-07  Theo de Raadt
    In "tty", permitting TIOCSTART is fine
2017-10-07  Theo de Raadt
    permit SYS___set_tcb, upcoming code will require this
2017-09-29  Martin Pieuchot
    New ddb(4) command: kill.
    Send an uncatchable SIGABRT to the process specified by the pid argument. Useful in case of CPU exhaustion to kill the DoSing process and generate a core for later inspection.
    ok phessler@, visa@, kettenis@, miod@
2017-09-27  Theo de Raadt
    guenther sleep-committed the version without #ifdefs
2017-09-27  Philip Guenther
    amd64 needs FS.base values (the TCB pointer) to be validated, as noncanonical addresses will cause a fault on load by the kernel.
    Problem observed by Maxime Villard
    ok kettenis@ deraadt@
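
What "canonical" means in the entry above can be shown with a short check. The sketch assumes 48-bit virtual addresses (4-level paging), where bits 63 through 47 must all be equal. `is_canonical_48()` is a hypothetical helper and not necessarily the check the kernel performs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true if addr is canonical under 48-bit virtual addressing:
 * bits 63..47 (17 bits) are either all zero or all one.
 */
static bool
is_canonical_48(uint64_t addr)
{
    uint64_t top = addr >> 47;

    return top == 0 || top == 0x1ffff;
}
```

With this assumption, 0x00007fffffffffff is the highest canonical address in the lower half and 0xffff800000000000 is the lowest in the upper half; everything between them is noncanonical.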
2017-09-25  Marc Espie
    sendsyslog should take a const char * everywhere.
    okay bluhm@, deraadt@
2017-09-15  Alexander Bluhm
    Coverity complains that top == NULL was checked and further down top->m_pkthdr.len was accessed without check. See CID 1452933.
    In fact top cannot be NULL there and the condition was always false. m_getuio() never reserved space for the header. The correct check is m == top to find the first mbuf.
    OK visa@
2017-09-15  Alexander Bluhm
    Coverity complained that the while loop at the end of m_adj() could dereference m if it is NULL. See CID 501458.
    - Remove the m NULL check from the final for loop, it is not necessary. This cannot happen due to the length calculation. The inconsistent code caused the coverity issue.
    - Move the m = mp close to all the loops where the mbuf chain is traversed.
    - Use mp to access the m_pkthdr consistently.
    - Move the next assignment from for (;;m = m->m_next) to the end of the loop to make it consistent to the previous for (;;) where the total length is calculated.
    OK visa@ mpi@
2017-09-11  Alexander Bluhm
    Coverity complains that the return value of sblock() is not checked in sorflush(), but in other places it is. See CID 1453099.
    The flags SB_NOINTR and M_WAITOK should avoid failure. Put an assert there to be sure.
    OK visa@ mpi@
2017-09-08  Theo de Raadt
    If you use sys/param.h, you don't need sys/types.h
2017-09-07  Alexander Bluhm
    In elf_load_file() do not call free(9) with an uninitialized size even if the pointer is NULL. This is not a real bug as free(9) checks the addr pointer before the size value, but the compiler cannot know that.
    found by clang -Wuninitialized; OK deraadt@
2017-09-06  Alexander Bluhm
    Do not pass an uninitialized size value to free(9) even if the addr pointer is NULL as it may generate false positive warnings.
    requested by markus@
2017-09-01  Martin Pieuchot
    Change sosetopt() to no longer free the mbuf it receives and change all the callers to call m_freem(9).
    Support from deraadt@ and tedu@, ok visa@, bluhm@
2017-08-29  Theo de Raadt
    Remove old deactivated pledge path code. A replacement mechanism is being brewed.
    ok beck
2017-08-27  Bob Beck
    Revisit 2q queue sizes. Limit the hot queue to 1/20th the cache size up to a max of 4096 pages. Limit the warm and cold queues to half the cache. This allows us to more effectively notice re-interest in buffers instead of losing it in a large hot queue.
    Discussed and shown with claudio@ and benno@ at tk217
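
The limits stated in the 2q entry above work out as follows. This sketch is one reading of the message, with assumptions: sizes are counted in pages, division truncates, and "half the cache" is applied as a cap on each of the warm and cold queues. The function names and `HOT_QUEUE_MAX_PAGES` are hypothetical, not taken from vfs_bio.c.

```c
#include <assert.h>

#define HOT_QUEUE_MAX_PAGES 4096L

/* Hot queue: 1/20th of the cache, but never more than 4096 pages. */
static long
hot_queue_limit(long cache_pages)
{
    long limit = cache_pages / 20;

    return limit > HOT_QUEUE_MAX_PAGES ? HOT_QUEUE_MAX_PAGES : limit;
}

/* Warm or cold queue: assumed here to be capped at half the cache each. */
static long
warm_or_cold_queue_limit(long cache_pages)
{
    return cache_pages / 2;
}
```

For a 20000-page cache the hot queue is limited to 1000 pages; the 4096-page ceiling starts to apply once the cache exceeds 81920 pages.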
2017-08-22  Martin Pieuchot
    Make sogetopt(9) caller responsible for allocating an MT_SOOPTS mbuf.
    Move a blocking memory allocation out of the socket lock and create a simpler alloc/free pattern to review. Now both m_get() and m_free() are in the same place.
    Discussed with bluhm@. Encouragements from deraadt@ and tedu@, ok kettenis@, florian@, visa@
2017-08-22  Stefan Fritsch
    Add some buffercache docs
    * add clarifications and bread_cluster() to buffercache(9)
    * add some comments to vfs_bio.c
    ok tedu@