Age | Commit message (Collapse) | Author |
|
'sockaddr' structure with socket's address. For key management and route
domain sockets it just returns error.
ok bluhm@
|
|
proper strings, adapt struct acct's ac_comm similarly. While here increase
ac_mem to 32 bits, increase ac_flag from 8 to 32 bits for future extensions,
add ac_pid for forensics, and reorder the structure to avoid compiler pads.
More work remains in the sa(8) command to use ac_pid better.
This is a flag day for the acct file format: new and old files and tools
are incompatible.
ok bluhm millert
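
The effect of reordering fields to avoid compiler padding can be shown in isolation. This is a minimal sketch with hypothetical structures and field names; it is not the real struct acct, and the exact sizes depend on the ABI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout, NOT the real struct acct: narrow fields interleaved
 * with wider ones typically force the compiler to insert padding. */
struct rec_padded {
	uint8_t		tag;	/* usually followed by 3 bytes of padding */
	uint32_t	size;
	uint8_t		kind;	/* usually followed by 3 bytes of padding */
	uint32_t	id;
};

/* Same fields, widest first, so the narrow ones share one slot. */
struct rec_packed {
	uint32_t	size;
	uint32_t	id;
	uint8_t		tag;
	uint8_t		kind;
};

/* Bytes saved by the reordering on the current ABI. */
static size_t
pad_saved(void)
{
	assert(sizeof(struct rec_packed) <= sizeof(struct rec_padded));
	return sizeof(struct rec_padded) - sizeof(struct rec_packed);
}
```

Because the record is written to disk as-is, any such layout change also changes the file format, which is why the commit calls it a flag day.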
|
|
including the NUL), in all internal interfaces, and expose this
in ktrace, core, or proc.h visibility.
ok millert
|
|
net/if_pppx.c pointed out by jsg@
ok gnezdo@ deraadt@ jsg@ mpi@ millert@
|
|
cold=2. Use the same strategy in a similar phase during hibernate.
|
|
phases where sleeps are not allowed, and this used to discover it.
msleep() needs the same check.
|
|
Ok deraadt@ guenther@
|
|
failure modes. Also, pack the code a little bit, easier to read.
|
|
OK mpi@
|
|
ok guenther@ rob@
|
|
and restart the suspend all over again. This was previously done by issuing
a task to the acpi thread, but this is simpler.
(I want to try to duplicate these tests earlier in the resume path...)
|
|
able to react to this suitably.
|
|
Ok deraadt@
|
|
with AML parsing outside the acpi thread, the locking-release dance
around wsdisplay_{suspend,resume} can be removed
ok kettenis
|
|
reset the MD state before bailing out. New MD function sleep_abort()
does that.
|
|
ok deraadt@ guenther@
|
|
in sleep_resume(), which seems sensible for other future systems also
|
|
from cursig() to postsig() or the caller itself. This will simplify locking.
Also alter sigactsfree() a bit and move it into process_zap() so ps_sigacts
is always a valid pointer.
OK semarie@
|
|
previously sbchecklowmem() (and sonewconn()) would look at the mbuf
and mbuf cluster pools to see if they were approaching their hard
limits. based on how many mbufs/clusters were allocated against the
limits, socket operations would start to fail with ENOBUFS until
utilisation went down.
mbufs and clusters have changed a lot since then though. there are
now many mbuf cluster pools, not just one for 2k clusters. because
of this the mbuf layer now limits the amount of memory all the mbuf
pools can allocate backend pages from rather than limit the individual
pools. this means sbchecklowmem() ends up looking at the default
pool hard limit, which is UINT_MAX, which in turn means
sbchecklowmem() probably never applies backpressure. this is made
worse on multiprocessor systems where per cpu caches of mbuf and
cluster pool items are enabled because the number of in use pool
items is distorted by the cpu caches.
this switches sbchecklowmem to looking at the page allocations made
by all the pools instead. the big benefit of this is that the page
allocations are much more representative of the overall mbuf memory
usage in the system. the downside is that the backend page
allocation accounting does not see idle memory held by pools. pools
cannot release partially free pages to the page backend (obviously),
and pools cache idle items to avoid thrashing on the backend page
allocator. this means the page allocation level is higher than the
memory used by actual in-flight mbufs.
however, this can also be a benefit. the backend page allocation is a
kind of smoothed out "trend" line. mbuf utilisation over short periods
can be extremely bursty because of things like rx ring dequeue and fill
cycles, or large socket sends. if you're trying to grow socket
buffers while these things are happening, luck becomes an important
factor in whether it will work or not. because pools cache idle items,
the backend page utilisation better represents the overall trend
of activity in the system and will give more consistent behaviour here.
this diff is deliberately simple. we're basically going from "no
limits" to "some sort of limit" for sockets again, so keeping the
code simple means it should be easy to understand and tweak in the
future.
ok djm@ visa@ claudio@
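
The idea of applying backpressure from the shared page allocation level, rather than from per-pool item counts, can be sketched as below. This is not the committed code: the function names, the two thresholds, and the hysteresis between them are assumptions made for illustration.

```c
#include <assert.h>

/* Hypothetical thresholds, in percent of the shared page allocation limit. */
#define LOWMEM_ENTER	80
#define LOWMEM_LEAVE	60

/* Percentage of the shared limit currently allocated as backend pages. */
static unsigned int
pool_used_pct(unsigned long alloc_bytes, unsigned long limit_bytes)
{
	assert(limit_bytes > 0);
	return (unsigned int)((unsigned long long)alloc_bytes * 100 /
	    limit_bytes);
}

/* Return the new low-memory state given the previous one. */
static int
check_lowmem(int was_low, unsigned long alloc_bytes,
    unsigned long limit_bytes)
{
	unsigned int pct = pool_used_pct(alloc_bytes, limit_bytes);

	if (pct >= LOWMEM_ENTER)
		return 1;	/* start refusing socket buffer growth */
	if (pct < LOWMEM_LEAVE)
		return 0;	/* plenty of room again */
	return was_low;		/* between thresholds: keep the old state */
}
```

Since the input is the page-level allocation, which already lags behind bursts of in-flight mbufs, a check like this reacts to the trend rather than to a single rx ring refill.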
|
|
ok kettenis
|
|
This avoids verb overlap with f_modify.
|
|
Use the f_event callback for checking event state within the pipe
event filters. This enables the same f_modify and f_process functions
to handle the different filter types.
OK anton@
|
|
OK mpi@
|
|
need to do this can do it a few moments later in a different hook
|
|
HIBERNATE that needs to be in MD code.
ok gkoehler
|
|
This splits out the MI sequencing, backing it with per-architecture helper
functions. Further steps will be necessary because ACPI and MD are too
tightly coupled, but soon we'll be able to use this code for more architectures
(which depends on figuring out the lowest-level cpu sleeping method)
ok kettenis
|
|
this makes it consistent with the rest of the network stack when
determining alignment.
ok bluhm@
|
|
Implement the poll(2) system call on top of the kqueue subsystem.
This obsoletes the old, non-MP-safe poll backend.
On entering poll(2), the new code translates each pollfd array entry
into a set of knotes. When these knotes receive events through kqueue,
the events are translated back to pollfd format.
Entries in the pollfd array can refer to the same file descriptor with
overlapping event masks. To allow such overlap with knotes, use an extra
kn_pollid key that separates knotes of different pollfd entries.
Adapted from DragonFly BSD, initial implementation by mpi@.
Tested in snaps for three weeks.
OK mpi@
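
The translation described above can be modelled in userland as follows. This is a sketch only: struct reg, enum filt, and both functions are hypothetical stand-ins for knotes and kqueue filters, and only POLLIN and POLLOUT are handled. The pollid field plays the role the commit gives to kn_pollid.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>

enum filt { FILT_READ, FILT_WRITE };	/* hypothetical filter types */

struct reg {				/* hypothetical knote stand-in */
	int		fd;
	enum filt	filter;
	size_t		pollid;		/* index of the originating pollfd */
};

/* Expand each pollfd entry into one registration per requested event. */
static size_t
pollfd_to_regs(const struct pollfd *pfd, size_t n, struct reg *out,
    size_t outmax)
{
	size_t i, k = 0;

	for (i = 0; i < n; i++) {
		if (pfd[i].fd < 0)
			continue;	/* poll(2) ignores negative fds */
		if ((pfd[i].events & POLLIN) && k < outmax)
			out[k++] = (struct reg){ pfd[i].fd, FILT_READ, i };
		if ((pfd[i].events & POLLOUT) && k < outmax)
			out[k++] = (struct reg){ pfd[i].fd, FILT_WRITE, i };
	}
	return k;
}

/* Translate a fired registration back into its pollfd entry. */
static void
reg_fired(const struct reg *r, struct pollfd *pfd, size_t n)
{
	assert(r->pollid < n);
	pfd[r->pollid].revents |= (r->filter == FILT_READ) ? POLLIN : POLLOUT;
}
```

Two pollfd entries naming the same descriptor yield registrations that differ only in pollid, so an event on one does not leak into the other entry's revents.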
|
|
original thread's stack hasn't been used since 2015.
ok miod@ deraadt@
|
|
the parent of ptraced processes. Especially ignore the signal mask set
by sigprocmask(2) in that case. In userret() alter the testcase for
when to call cursig() which is only there to avoid taking the
KERNEL_LOCK when returning from a MP safe syscall. This can be revisited
once cursig() is MP safe.
Problem with debugging signal handlers found by kurt@
Tested and OK kurt@, OK mpi@
|
|
ok deraadt
|
|
apostrophe.
|
|
A few variables in the kernel are only writeable before securelevel is
raised. It makes sense to handle them with less code.
OK sthen@ bluhm@
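
One way to handle such variables with less code is a single helper that every call site shares. The sketch below is an assumption about the general shape, not the committed function: the names are hypothetical, and returning EPERM once the level is raised is assumed here.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's securelevel. */
static int securelevel_model = 0;

/*
 * Copy the current value out if requested, and accept a new value only
 * while the level has not been raised.
 */
static int
level_guarded_int(int *oldp, const int *newp, int *valp)
{
	assert(valp != NULL);
	if (oldp != NULL)
		*oldp = *valp;
	if (newp != NULL) {
		if (securelevel_model > 0)
			return EPERM;	/* read-only once the level is raised */
		*valp = *newp;
	}
	return 0;
}
```

Each guarded variable then needs one call to the helper instead of its own copy of the level check.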
|
|
arithmetic is undefined behavior. Check that size is positive
before adding to pointer. While there, use NUL char for string
termination.
found by kubsan; joint work with tobhe@; OK millert@
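
The pattern of checking the size before adding it to a pointer, and terminating with a NUL character, looks roughly like this. The function is hypothetical; the code the commit touched is not visible in this log.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Copy src into buf (capacity size) and NUL-terminate it.
 * Returns the number of bytes stored, excluding the terminator.
 */
static size_t
copy_bounded(char *buf, size_t size, const char *src)
{
	char *p, *end;

	if (buf == NULL || size == 0)
		return 0;	/* bail out before forming buf + size */
	assert(src != NULL);
	end = buf + size - 1;	/* last byte is reserved for the NUL */
	for (p = buf; p < end && *src != '\0'; )
		*p++ = *src++;
	*p = '\0';		/* NUL character, not the NULL pointer */
	return (size_t)(p - buf);
}
```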
|
|
found by kubsan; joint work with tobhe@; OK miod@
|
|
descriptors for explicit fencing
tested with libdrm's amdgpu_test syncobj timeline tests and vkcube on
intel broadwell with Mesa 21.3 (which hangs without sync file support
after the 'anv: Assume syncobj support' Mesa commit)
feedback and ok visa@
|
|
If the first mbuf of a chain in m_pullup is a cluster, check if the
cluster is read-only (shared or an external buffer). If so, don't
touch it and create a new mbuf for the pullup data.
This restores original 4.4BSD m_pullup, that not only returned
contiguous mbuf data of the specified length, but also converted
read-only clusters into writeable memory. The latter feature was
lost during some refactoring.
from ehrhardt@; tested by weerd@; OK stsp@ bluhm@ claudio@
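
The behaviour can be modelled in userland as below. This is a simplified sketch, not the kernel's m_pullup: struct seg and pullup_model are hypothetical, segments are caller-owned and never freed, and the model always copies when the first segment is too short, whereas the real function may reuse space in the first mbuf.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct seg {			/* hypothetical mbuf stand-in */
	struct seg	*next;
	char		*data;
	size_t		 len;
	int		 readonly;	/* shared or external storage */
};

/* Return a segment whose first n bytes are contiguous and writable. */
static struct seg *
pullup_model(struct seg *s, size_t n)
{
	struct seg *head, *p;
	size_t off = 0, take, total = 0;

	assert(s != NULL);
	if (!s->readonly && s->len >= n)
		return s;		/* already contiguous and private */
	for (p = s; p != NULL; p = p->next)
		total += p->len;
	if (total < n)
		return NULL;		/* chain is too short */
	if ((head = malloc(sizeof(*head))) == NULL)
		return NULL;
	if ((head->data = malloc(n ? n : 1)) == NULL) {
		free(head);
		return NULL;
	}
	for (p = s; p != NULL && off < n; p = p->next) {
		take = p->len < n - off ? p->len : n - off;
		memcpy(head->data + off, p->data, take);
		off += take;
		p->data += take;	/* only the header moves; the */
		p->len -= take;		/* read-only storage is untouched */
		if (p->len > 0)
			break;
	}
	head->next = p;			/* rest of the original chain */
	head->len = n;
	head->readonly = 0;
	return head;
}
```

The key property is that the read-only storage is only ever read from; the caller receives fresh, private memory for the pulled-up bytes.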
|
|
The previous limit of VM_PHYSSEG_MAX ranges (16) was proving too small for
newer machines. This diff reorganizes the hibernate signature block to allow
for 22 ranges by removing the kernel version comparison and replacing it
with a SHA of several unique kernel features (the version string and several
addresses of functions not inside the same .o).
Reported by claudio@, who also helped fix some issues in the diff. Input
from deraadt@ as well.
Tested by myself and claudio on a variety of machines. Only compile tested on
i386 as I no longer have S4-capable i386 hardware.
ok claudio@
|
|
which was unlocked with accept(2) unlocking. For key management and
route domain sockets it just copies the read-only data.
ok bluhm@
|
|
Changes the way printf debugging is done in kern_unveil.c.
Currently, each printf() is enclosed in #ifdef DEBUG_UNVEIL. It moves
to using DPRINTF(), and reduces the number of #ifdef inside the file.
Also changes some strings to use __func__ instead of using the
function name verbatim.
ok visa@
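
The general shape of such a macro is sketched below. The exact definition used in kern_unveil.c is not shown in this log, so the macro body and the lookup_model function are assumptions; the sketch uses the standard C99 variadic form.

```c
#include <assert.h>
#include <stdio.h>

/* #define DEBUG_UNVEIL */	/* define to enable the debug output */

#ifdef DEBUG_UNVEIL
#define DPRINTF(...)	do { printf(__VA_ARGS__); } while (0)
#else
#define DPRINTF(...)	do { } while (0)
#endif

/* Hypothetical function showing a call site without its own #ifdef. */
static int
lookup_model(const char *path)
{
	assert(path != NULL);
	/* __func__ keeps the message correct if the function is renamed */
	DPRINTF("%s: looking up %s\n", __func__, path);
	return path[0] == '/';
}
```

With the macro in place, the only #ifdef left is the one around the macro definition itself.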
|
|
unveil(2). It is not set for nodes that are added as a result of a file
being added via unveil(2). Use this flag to test if backtracking should
be done or not. Also introduce UNVEIL_MASK which checks if any user flags
are set and is used to properly return EACCES vs ENOENT.
This fixes a problem where unveil("/", "r") & unveil("/usr/bin/id", "rx")
cause an error when read accessing "/usr/bin". It also makes sure that
unveil(path, "") will return ENOENT for any access of anything under path.
Reported by and OK semarie@
|
|
pass in the already read hibernate_info instead of reading it again.
ok deraadt@
|
|
it's the 'b' slice and (sanity) check against the partition count.
Also, make the "is union hibernate_info too large?" a compile time
check.
ok deraadt@
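
A compile-time size check of this kind can be written as follows. The union, its fields, and the 512-byte block size are hypothetical; the sketch uses C11 _Static_assert, whereas kernel code may use its own assertion macro.

```c
#include <assert.h>	/* C11 also offers the static_assert spelling here */
#include <stddef.h>

#define SIG_BLOCK_SIZE	512	/* hypothetical on-disk block size */

union sig_info {		/* hypothetical, not union hibernate_info */
	struct {
		unsigned int	magic;
		unsigned long	nranges;
		char		version[128];
	} fields;
	char	pad[SIG_BLOCK_SIZE];
};

/* The build fails here if a new field grows the union past one block. */
_Static_assert(sizeof(union sig_info) <= SIG_BLOCK_SIZE,
    "sig_info does not fit in one block");

/* Size of the signature block as laid out by this compiler. */
static size_t
sig_info_size(void)
{
	return sizeof(union sig_info);
}
```

Moving the check to compile time means an oversized structure is caught when the kernel is built rather than at hibernate time.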
|
|
resume. This fixes setups where a umass device no longer attaching
at resume results in a softraid device being renumbered so the
hibernate-time device is no longer correct
ok mlarkin@ jsing@
|
|
|