Age | Commit message | Author |
|
for use by the linux compatibility APIs in drm(4).
While I hate infecting code in sys/kern with this, untangling all the
consequences of having different types and different signedness is too much for me
right now. The best strategy may be to change ticks itself to be long
but that needs some careful auditing.
ok deraadt@
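A tiny illustration of the signedness trap being referred to; this is a
hypothetical userland fragment, not the kernel code:

    int t = -1;                     /* a wrapped signed tick counter */
    unsigned long deadline = 1000;

    /* the usual arithmetic conversions turn t into ULONG_MAX here, so
     * a wrapped counter compares as huge instead of small */
    if (t < deadline)
            ;                       /* never taken, surprisingly */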
|
|
for sa_len and sa_family is provided. This will make handling of
socket name mbufs within the kernel safer.
issue reported by Ilja Van Sprundel; OK claudio@
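A minimal sketch of the kind of check described; the names come from the
standard mbuf/sockaddr APIs, but the real validation is more involved:

    struct sockaddr *sa;

    if (m->m_len < offsetof(struct sockaddr, sa_data))
            return (EINVAL);        /* too short to hold sa_len/sa_family */
    sa = mtod(m, struct sockaddr *);
    if (sa->sa_len > m->m_len)
            return (EINVAL);        /* claimed length exceeds the mbuf */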
|
|
if TIOCGPGRP fails.
Issue found by Ilja van Sprundel.
ok bluhm@, millert@, deraadt@
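The usual shape of the fix for this class of leak is to copy out only on
success; a hedged sketch, with the destination pointer name made up:

    int pgid, error;

    error = (*fp->f_ops->fo_ioctl)(fp, TIOCGPGRP, (caddr_t)&pgid, p);
    if (error == 0)                 /* pgid holds defined data only now */
            error = copyout(&pgid, data, sizeof(pgid));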
|
|
splicing, another process may allocate it in the meantime. Then
one of the splicing structures was leaked in sosplice(). Recheck that
no struct sosplice exists after a potential sleep.
reported by Ilja Van Sprundel; OK mpi@
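A sketch of the recheck-after-sleep pattern described above; the pool name
is illustrative:

    if (so->so_sp == NULL) {
            sp = pool_get(&sosplice_pool, PR_WAITOK | PR_ZERO);
            /* pool_get() may have slept; another process may have
             * attached a struct sosplice to the socket meanwhile */
            if (so->so_sp == NULL)
                    so->so_sp = sp;
            else
                    pool_put(&sosplice_pool, sp);   /* lost the race */
    }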
|
|
and `so_rcv'.
ok bluhm@, claudio@, visa@
|
|
ok bluhm@, claudio@, visa@
|
|
Found by Ilja Van Sprundel
ok kettenis
|
|
uninitialised data can be dumped into the ktrace message.
Found by Ilja Van Sprundel
OK bluhm@
|
|
Do the same in sendsyslog(2) and document the behavior.
reported by Ilja Van Sprundel; OK millert@ deraadt@
|
|
with the socket lock.
This change is safe because sbreserve() already asserts that the lock is
held, but it acts as implicit documentation and indicates that I looked
at the function.
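Callers then look roughly like this sketch (simplified; the signatures have
varied between versions):

    solock(so);
    if (sbreserve(so, &so->so_snd, cc))
            error = ENOBUFS;
    sounlock(so);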
|
|
been acquired in sosend(). Fixes a kernel lock assertion panic.
OK visa@ mpi@
|
|
Buffercache performs read-ahead for cluster reads by extending
the length of an original read operation to MAXPHYS (64k).
Upon I/O completion, the length is trimmed, the buffer is
returned to the filesystem, and the remaining data is cached.
However, under certain circumstances, the underlying hardware
may fail to do a complete I/O operation and return with a non-
zero value of the residual length (i.e. data that wasn't read).
The residual length may exceed the size of an original request
and must be re-adjusted to uphold the contract with the caller,
e.g. the filesystem. At the same time, read-ahead buffers that
cover chunks of memory corresponding to the residual length
must be invalidated and not cached.
Discussed at length during d2k17, ok tedu
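A hedged sketch of the re-adjustment, with illustrative variable names:

    /* on completion, bp->b_resid is what the device did not transfer */
    if (bp->b_resid > original_len) {
            /* the extended, read-ahead part of the I/O also failed:
             * clamp so the caller's contract holds, and invalidate the
             * read-ahead buffers instead of caching them */
            bp->b_resid = original_len;
    }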
|
|
lock.
Prevents a future lock recursion since soo_ioctl() will need to grab
the lock.
ok bluhm@, visa@
|
|
Suggested by and OK dlg@
|
|
of items that a cache list is allowed to hold. This lets the cache
release resources back to the common pool after pressure on the cache
has decreased.
OK dlg@
|
|
this is almost a straightforward replacement of spl ops with mutex ops,
except the accounting has been shuffled around. memory is counted
as used before an attempt to allocate it from uvm is made to prevent
overcommitting memory. this is modelled on how pools limit allocations.
the uvm bits have been eyeballed by kettenis@ who says they should be safe.
visa@ found some nits which have been fixed.
tested by chris@ and amit kulkarni
ok kettenis@ visa@ mpi@
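the ordering described, as a minimal sketch with hypothetical names (the
real code uses km_alloc and the uvm constraint structures):

    mtx_enter(&kmem_mtx);
    if (kmem_used + sz > kmem_limit) {
            mtx_leave(&kmem_mtx);
            return (NULL);                  /* would overcommit */
    }
    kmem_used += sz;                        /* reserve before allocating */
    mtx_leave(&kmem_mtx);

    va = alloc_from_uvm(sz);                /* hypothetical allocator */
    if (va == NULL) {
            mtx_enter(&kmem_mtx);
            kmem_used -= sz;                /* roll the reservation back */
            mtx_leave(&kmem_mtx);
    }
    return (va);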
|
|
It is unsafe to sleep while iterating the list of pending events in
kqueue_scan().
Reported by abieber@ and juanfra@
|
|
add a little breathing room.
|
|
Implicitly protects `so_state' with the socket lock in sosend().
ok visa@, bluhm@
|
|
ok bluhm@, visa@
|
|
ok bluhm@, visa@
|
|
to KNOTE() as we are already holding the lock. Fixes "panic:
rw_enter: netlock locking against myself" reported by Gregor Best
and reproduced with src/regress/lib/libtls/gotls.
OK millert@
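The caller side of the pattern, roughly (simplified):

    /* the socket lock is already held here, so tell the filters
     * not to take it again */
    KNOTE(&so->so_rcv.sb_sel.si_note, NOTE_SUBMIT);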
|
|
kqueue filters.
ok millert@, bluhm@, visa@
|
|
returns EIO. The base system has been cleaned of TIOCSTI uses (collaboration
between anton and me), and the ports tree appears mostly clean. A few
stragglers may be discovered and cleaned up later...
In a month or so, we should see if the #define can be removed entirely.
ok anton tedu, support from millert
|
|
While here, document an abuse of the parent socket's lock.
Problem reported by krw@, analysis and ok bluhm@
|
|
buffers.
This is one step towards unlocking TCP input path. Note that all the
functions asserting for the socket lock are not necessarily MP-safe.
Not all the fields of 'struct socket' are protected.
Introduce a new kernel-only kqueue hint, NOTE_SUBMIT, to be able to
tell when a filter needs to lock the underlying data structures. Logic
and name taken from NetBSD.
Tested by Hrvoje Popovski.
ok claudio@, bluhm@, mikeb@
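A simplified sketch of how a filter consumes the hint, modelled on the
NetBSD logic mentioned above (the real filt_soread differs in detail):

    static int
    filt_soread(struct knote *kn, long hint)
    {
            struct socket *so = kn->kn_fp->f_data;
            int rv;

            if ((hint & NOTE_SUBMIT) == 0)
                    solock(so);     /* event did not come from the stack */
            kn->kn_data = so->so_rcv.sb_cc;
            rv = (kn->kn_data > 0);
            if ((hint & NOTE_SUBMIT) == 0)
                    sounlock(so);
            return (rv);
    }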
|
|
hardcoding 64 is too optimistic.
|
|
previously it would figure out if there's enough items overall for
all the cpus to have full active and inactive free lists. this
included currently allocated items, which pools wont actually hold
on a free list and cannot predict when they will come back.
instead, see if there's enough items in the idle lists in the depot
that could instead go on all the free lists on the cpus. if there's
enough idle items, then we can grow.
tested by hrvoje popovski and amit kulkarni
ok visa@
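in sketch form, the new check is something like this (field names are
illustrative, not the real struct pool members):

    /* each cpu wants a full active and a full inactive free list */
    if (pp->pr_cache_nlist >= ncpusfound * 2)
            grow = 1;       /* enough idle depot lists to go around */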
|
|
so that an unhibernating kernel can detect whether it is running with
the kernel it booted.
ok mlarkin
|
|
|
|
|
|
They might not be fully constructed.
ok mpi@ deraadt@ bluhm@
|
|
to grab the rwlock.
Problem reported by Rivo Nurges.
ok bluhm@
|
|
if the lock around the global depot of extra cache lists is contended
a lot in between the gc task runs, consider growing the number of
entries a free list can hold.
the size of the list is bounded by the number of pool items the
current set of pages can represent to avoid having cpus starve each
other. im not sure this semantic is right (or the least worst) but
we're putting it in now to see what happens.
this also means reality matches the documentation i just committed
in pool_cache_init.9.
tested by hrvoje popovski and amit kulkarni
ok visa@
|
|
found by regress/sys/kern/pledge/generic; OK deraadt@
|
|
the cpu caches in pools amortise the cost of accessing global
structures by moving lists of items around instead of individual
items. excess lists of items are stored in the global pool struct,
but these idle lists never get returned back to the system for use
elsewhere.
this adds a timestamp to the global idle list, which is updated
when the idle list stops being empty. if the idle list hasn't been
empty for a while, it means the per cpu caches arent using the idle
entries and they can be recovered. timestamping the pages prevents
recovery of a lot of items that may be used again shortly. eg, rx
ring processing and replenishing from rate limited interrupts tends
to allocate and free items in large chunks, which the timestamping
smooths out.
gc'ed lists are returned to the pool pages, which in turn get gc'ed
back to uvm.
ok visa@
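roughly, the gc decision reads like this sketch (names and the timestamp
comparison are illustrative):

    /* run periodically from the pool gc task */
    if (pp->pr_cache_nlist > 0 &&
        ticks - pp->pr_cache_timestamp > POOL_CACHE_IDLE_TICKS) {
            /* the cpus havent drained the depot for a while; take an
             * idle list and return its items to the pool pages */
            pl = pool_cache_list_take(pp);      /* hypothetical helper */
            pool_cache_list_put(pp, pl);
    }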
|
|
this lets pool_cache_list_put return items to the pages. currently,
if pool_cache_list_put is called while the per cpu caches are
enabled, the items on the list will be put straight back onto
another list in the cpu cache. this also avoids counting puts for
these items twice. a put for these items has already been counted
when the items went to a cpu cache, so it doesnt need to be counted
again when they go back to the pool pages.
another side effect of this is that pool_cache_list_put can take
the pool mutex once when returning all the items in the list with
pool_do_put, rather than once per item.
ok visa@
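sketched, the once-per-list locking looks like this (simplified from the
real pool_cache_list_put):

    struct pool_cache_item *next;

    mtx_enter(&pp->pr_mtx);
    while (pl != NULL) {
            next = pl->ci_next;
            pool_do_put(pp, pl);        /* back onto the pool pages */
            pl = next;
    }
    mtx_leave(&pp->pr_mtx);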
|
|
|
|
|
|
KERN_POOL_CACHE reports info about the global cache, like how long
the lists of cache items the cpus build should be and how many of these
lists are idle on the pool struct.
KERN_POOL_CACHE_CPUS reports counters from each cpu. the counters
are for how many item and list operations the cache has handled on
a cpu. the sysctl provides an array of ncpusfound * struct
kinfo_pool_cache_cpu, not a single struct kinfo_pool_cache_cpu.
tested by hrvoje popovski
ok mikeb@ millert@
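from userland the new mib can be read along these lines (a sketch; includes
and the pool-index lookup via KERN_POOL_NPOOLS are trimmed):

    int mib[4] = { CTL_KERN, KERN_POOL, KERN_POOL_CACHE_CPUS, pool_idx };
    struct kinfo_pool_cache_cpu *kpcc;
    size_t len;

    if (sysctl(mib, 4, NULL, &len, NULL, 0) == -1)
            err(1, "sysctl");
    kpcc = malloc(len);                 /* ncpusfound entries */
    if (sysctl(mib, 4, kpcc, &len, NULL, 0) == -1)
            err(1, "sysctl");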
|
|
theyre both wrappers around sysctl__string, which is where half the
fix is too.
|
|
lists of free items on the per cpu caches are built out of the pool
items as struct pool_cache_item, not struct pool_cache. make the KASSERT
in pool_cache_init check that properly.
|
|
this tweaks the len argument to sysctl_rdstring, sysctl_struct, and
sysctl_rdstruct.
there's probably more to fix.
ok millert@
|
|
calls. They'll be a little less visible, but still in the system logs.
ok bluhm
|
|
SIGILL, SIGBUS, SIGSEGV signals. Make such memory violations visible
in lastcomm(1). This also works if a program tries to hide them
with a signal handler. Manual kill -SEGV does not generate false
positives.
OK deraadt@
|
|
maybe this will help prevent misassignment in the future.
|
|
this removes the need for sys/param.h. this code can be built with
only sys/tree.h, which in turn only needs sys/_null.h.
|
|
these are provided so an RBT and its topology can be copied without
having to reinsert the copied nodes into a new tree.
there are two reasons RBT_LEFT/RIGHT/PARENT macros cant be used like
RB_LEFT/RIGHT/PARENT for this. firstly, RBT_LEFT and co are functions that
return a pointer value, they dont provide access to the pointer
itself for use as an lvalue that you can assign to. secondly, RBT
entries dont store pointers to other nodes, they point to the
RBT_ENTRY structures inside other nodes. this means that RBT_SET_LEFT
and co have to get an offset from the node to the RBT_ENTRY and
store that.
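illustratively (field names made up, the real macros are generated):

    /* a node's child link points at the child's embedded RBT_ENTRY,
     * not at the child node itself, hence the setter translates */
    node->entry.rbt_left = (left == NULL) ? NULL : &left->entry;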
|
|
when something went wrong. This makes it possible to monitor whether
the system is under attack and whether the attack has been prevented
by OpenBSD pledge(2).
OK deraadt@ millert@ jmc@
|
|
pfkey and unix sockets.
ok claudio@
|