Age | Commit message | Author |
|
Introduce `fd_lock' rwlock(9) and use it to protect the `fd_fbufs_in' fuse
buffer queue and the `fd_rklist' knote list.
Tested by Rafael Sadowski.
Discussed with and ok from bluhm
|
|
mechanical 'selinfo' to 'klist' replacement in 'vnode' structure because
the knote(9) API is already used.
<sys/selinfo.h> headers added where it was required.
ok bluhm
|
|
ok miod@ millert@
|
|
selinfo is just a wrapper for klist. netstat(1) and libkvm use the socket
structure, but don't touch so_{snd,rcv}.sb_sel.
ok visa@
|
|
receive buffer. As was done for the SS_CANTSENDMORE bit, the definitions
are kept as is, but these bits now belong to the `sb_state' of the receive
buffer. `sb_state' is ORed with `so_state' when socket data is exported to
userland.
ok bluhm@
|
|
This time, the socket buffer lock requires solock() to be held. As part of
the socket buffers standalone locking work, move the socket state bits which
represent the state of its buffers to per-buffer state.
Unlike the previous, reverted diff, the SS_CANTSENDMORE definition is left
as is, but it is used only with `sb_state'. `sb_state' is ORed with the
original `so_state' when the socket's data is exported to userland, so the
ABI is kept as it was.
Inputs from deraadt@.
ok bluhm@
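
The export step described above can be sketched roughly as follows. This is
only an illustration, not the kernel code: the `sketch_*' structures, the
sketch_export_state() helper and the bit values are assumptions made for
the example. Only the field names `so_state', `sb_state' and `so_snd' and
the SS_CANTSENDMORE name come from the commit messages.

```c
/* Illustrative bit values (assumed); the real ones live in kernel headers. */
#define SKETCH_SS_ISCONNECTED	0x002
#define SKETCH_SS_CANTSENDMORE	0x010

/* Hypothetical stand-in for a socket buffer. */
struct sketch_sockbuf {
	int sb_state;			/* per-buffer state bits */
};

/* Hypothetical stand-in for a socket. */
struct sketch_socket {
	int so_state;			/* socket-wide state bits */
	struct sketch_sockbuf so_snd;	/* send buffer */
};

/*
 * Build the state value handed to userland: the socket-wide bits ORed
 * with the bits that now live in the send buffer.
 */
int
sketch_export_state(const struct sketch_socket *so)
{
	return so->so_state | so->so_snd.sb_state;
}
```

Because the bit keeps its old value, a consumer that tests the exported
word for SS_CANTSENDMORE would see the same result as before the move.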
|
|
|
|
socket buffers standalone locking work, move socket state bits which
represent its buffers state to per buffer state. Introduce `sb_state' and
turn SS_CANTSENDMORE to SBS_CANTSENDMORE. This bit will be processed on
`so_snd' buffer only.
Move the SS_CANTRCVMORE and SS_RCVATMARK bits in a separate diff to make
review easier and to rule out possible so_rcv/so_snd mistypes.
Also, don't adjust the remaining SS_* bits right now.
ok millert@
|
|
c99 6.11.5:
"The placement of a storage-class specifier other than at the beginning
of the declaration specifiers in a declaration is an obsolescent
feature."
ok miod@ tb@
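
A small illustration of the rule quoted above, using only standard C. The
variable and function names are made up for the example.

```c
/* Preferred: the storage-class specifier (static) comes first. */
static const int sketch_limit_value = 4;

/*
 * Obsolescent per C99 6.11.5: storage-class specifier placed after a
 * type qualifier. Left commented out on purpose.
 *
 * const static int sketch_limit_late = 4;
 */

int
sketch_limit(void)
{
	return sketch_limit_value;
}
```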
|
|
Also remove unneeded includes of <sys/poll.h> and <sys/select.h>.
Some addenda from jsg@.
OK miod@ mpi@
|
|
`so_lock' rwlock(9) instead of global `unp_lock' which locks the whole
layer.
The PCBs of unix(4) sockets are linked to each other and we need to lock
them both. This introduces a lock ordering problem: while thread (1)
holds the lock on `so1' and tries to lock `so2', thread (2) could hold
the lock on `so2' and try to lock `so1'. To solve this we always lock
sockets in a strict order.
For sockets which are already accessible from userland, we always lock
the socket with the smallest memory address first. Sometimes we need to
unlock a socket before locking its peer, and then lock it again.
We use reference counters to prevent destruction of the connected peer
during the relock. We also handle the case where the peer socket was
replaced by another socket.
For newly connected sockets, which are not yet exported to userland by
accept(2), we always lock the listening socket `head' first. This allows
us to avoid an unwanted relock within the accept(2) syscall.
ok claudio@
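
The address-ordering rule can be sketched in userland C. This is a
simplified illustration under stated assumptions: it uses pthread mutexes
instead of rwlock(9), the sketch_lock_pair() and sketch_unlock_pair()
helpers are hypothetical, and the relock and reference counting steps
described above are not modelled.

```c
#include <pthread.h>
#include <stdint.h>

/*
 * Lock two mutexes, lowest address first, so that every thread acquires
 * them in the same sequence. Returns the mutex that was locked first.
 */
pthread_mutex_t *
sketch_lock_pair(pthread_mutex_t *a, pthread_mutex_t *b)
{
	pthread_mutex_t *first = a, *second = b;

	if ((uintptr_t)b < (uintptr_t)a) {
		first = b;
		second = a;
	}
	pthread_mutex_lock(first);
	if (second != first)
		pthread_mutex_lock(second);
	return first;
}

/* Release both mutexes; the unlock order does not matter for deadlocks. */
void
sketch_unlock_pair(pthread_mutex_t *a, pthread_mutex_t *b)
{
	pthread_mutex_unlock(a);
	if (b != a)
		pthread_mutex_unlock(b);
}
```

Whichever order the caller passes the two locks in, the one at the lower
address is taken first, so two threads locking the same pair cannot end
up each waiting for the other.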
|
|
OK mpi@
|
|
for the lock operation and to pass a value to the unlock operation.
sofree() still needs an extra flag to know if sounlock() should be called
or not. But sofree() is called less often and mostly without keeping the lock.
OK mpi@ mvs@
|
|
OK mpi@
|
|
OK mpi@
|
|
This avoids verb overlap with f_modify.
|
|
This enables the system to deliver POLLHUP when pollfd.events == 0.
|
|
|
|
Restrict the circumstances where EVFILT_EXCEPT filters trigger:
* when out-of-band data is present and NOTE_OOB is requested.
* when the channel is fully closed and the consumer is poll(2).
This should clarify the logic and suppress events that kqueue-based
poll(2) does not expect.
OK mpi@
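
The two conditions listed above can be written as a single predicate. The
structure and function below are hypothetical and only restate the commit
message; they are not the kernel's filter code.

```c
#include <stdbool.h>

/* Hypothetical summary of the state a filter would look at. */
struct sketch_except_state {
	bool oob_pending;		/* out-of-band data is present */
	bool note_oob_requested;	/* NOTE_OOB was requested */
	bool fully_closed;		/* the channel is fully closed */
	bool consumer_is_poll;		/* the consumer is poll(2) */
};

/* Return true only in the two cases named in the commit message. */
bool
sketch_except_triggers(const struct sketch_except_state *st)
{
	if (st->oob_pending && st->note_oob_requested)
		return true;
	if (st->fully_closed && st->consumer_is_poll)
		return true;
	return false;
}
```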
|
|
Currently, the only intended direct usage of the EVFILT_EXCEPT filter
is with NOTE_OOB to detect out-of-band data in ptys and sockets.
NOTE_OOB does not apply to FIFOs or pipes. Prevent the user from
registering the filter with these file types. The filter code is for
the kernel's internal use.
OK mpi@
|
|
Pass the device vnode as a parameter to VOP_STRATEGY() to allow calling
the correct vop_strategy callback. Now the vnode is also available
in the callback.
OK mpi@
|
|
Make __EV_POLL specific to kqueue-based poll(2), to remove overlap
with __EV_SELECT that only select(2) uses.
OK millert@ mpi@
|
|
Prevent select(2) from indicating an exceptional condition when the
other end of a FIFO or pipe is closed.
Originally, select(2) returned an exceptfds event only with a pty or
socket that has out-of-band data pending. millert@ says that OpenBSD
diverged from this by accident when poll(2) and select(2) were changed
to use the same backend code in 2003.
OK millert@
|
|
The given set of fds is converted to equivalent kevents using EV_SET(2)
and passed to the scanning internals of kevent(2): kqueue_scan().
ktrace(1) will now output the converted kevents in addition to the usual
set bits, to make it possible to find errors in the conversion.
This switch implies that poll(2) and select(2) will now query underlying
kqfilters instead of the *_poll() routines. An increase in latency is
visible, especially with UDP sockets and NET_LOCK()-contended subsystems,
and will be addressed in next steps.
Based on similar work done on MacOS and DragonFlyBSD with inputs from
visa@, millert@, anton@, cheloha@, thanks!
Tested by many, thanks!
ok claudio@, bluhm@
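
The conversion idea can be sketched as a mapping from requested poll
events to filter kinds. Everything here is an assumption made for
illustration: the `sketch_filter' enum and sketch_filters_for() are
hypothetical, and the exact mapping (POLLIN to a read filter, POLLOUT to
a write filter, POLLPRI to an exception filter) is not stated in the
commit message. The real code builds kevents with EV_SET(2).

```c
#include <poll.h>

/* Hypothetical filter kinds standing in for the kqueue filters. */
enum sketch_filter {
	SKETCH_FILT_READ   = 1 << 0,
	SKETCH_FILT_WRITE  = 1 << 1,
	SKETCH_FILT_EXCEPT = 1 << 2
};

/* Return the set of sketch filters one pollfd entry would map to. */
int
sketch_filters_for(const struct pollfd *pfd)
{
	int filters = 0;

	if (pfd->events & POLLIN)
		filters |= SKETCH_FILT_READ;
	if (pfd->events & POLLOUT)
		filters |= SKETCH_FILT_WRITE;
	if (pfd->events & POLLPRI)
		filters |= SKETCH_FILT_EXCEPT;
	return filters;
}
```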
|
|
The filterops instances already provide f_modify and f_process
callbacks with proper internal locking. Locking of socket klists
has been the missing detail for MP-safety.
OK mpi@
|
|
This is a change of behavior: events won't be generated if there is
something to read on the fd. Only EV_EOF or NOTE_OOB will now be
reported.
While here, add a new filter for FIFOs supporting EV_EOF and __EV_HUP.
ok visa@
|
|
ok mpi@ visa@ (as part of larger diff)
|
|
These functions are only stubs (returning 0). Replace them with the nullop
function (same behaviour). There are no intended behaviour changes.
While here, reorder some vop_islocked members in structs to be next to
the other vop_{,un}lock members.
ok visa@
|
|
When calling namei(), cn_pnbuf is kept allocated if the fs
implementation sets the SAVENAME flag. In such cases, the fs
implementation is expected to also release the memory when the work is
done. fuse(4) didn't return the memory to the pool. Correct it.
ok mpi@
|
|
It replaces spec_badop, fifo_badop, dead_badop and mfs_badop, which
are only calls to panic(9), with a single function, vop_generic_badop().
No intended behaviour changes (apart from the panic message, which isn't
the same).
ok mpi@
|
|
OK millert@ mpi@
|
|
OK mvs@
|
|
without the KERNEL_LOCK.
This moves VXLOCK and VXWANT to a mutex protected v_lflag field and also
v_lockcount is protected by this mutex.
The vn_lock() dance is overly complex and all of this should probably be
replaced by a proper lock on the vnode, but such a diff is a lot more complex. This
is an intermediate step so that at least some calls can be modified to grab
the KERNEL_LOCK later or not at all.
OK mpi@
|
|
ok mpi@
|
|
|
|
OK mpi@ as part of a larger diff
|
|
Rename klist_{insert,remove}() to klist_{insert,remove}_locked().
These functions assume that the caller has locked the klist. The current
state of locking remains intact because the kernel lock is still used
with all klists.
Add new functions klist_insert() and klist_remove() that lock the klist
internally. This allows some code simplification.
OK mpi@
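
The split between a `_locked' variant and a self-locking wrapper can be
sketched as below. This is a userland illustration with assumed names:
the `sketch_*' types and functions are hypothetical and a pthread mutex
stands in for whatever lock protects a real klist.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical list element. */
struct sketch_note {
	struct sketch_note *next;
};

/* Hypothetical list with its own lock. */
struct sketch_list {
	pthread_mutex_t lock;
	struct sketch_note *head;
	int count;
};

/* Caller must already hold list->lock. */
void
sketch_insert_locked(struct sketch_list *list, struct sketch_note *n)
{
	n->next = list->head;
	list->head = n;
	list->count++;
}

/* Wrapper that takes and releases the lock internally. */
void
sketch_insert(struct sketch_list *list, struct sketch_note *n)
{
	pthread_mutex_lock(&list->lock);
	sketch_insert_locked(list, n);
	pthread_mutex_unlock(&list->lock);
}
```

Callers that already hold the lock keep using the `_locked' form, while
everyone else can call the wrapper and drop their own lock and unlock
pair.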
|
|
OK millert@
|
|
ok visa@, millert@
|
|
This is only done in poll-compatibility mode, when __EV_POLL is set.
ok visa@, millert@
|
|
While here prefix kernel-only EV flags with two underbars.
Suggested by kettenis@, ok visa@
|
|
Adapt FS kqfilters to always return true when the flag is set and bypass
the polling mechanism of the NFS thread.
While here implement a write filter for NFS.
ok visa@
|
|
ok visa@
|
|
Prevent generating events that do not correspond to how the fifo has been
opened.
ok visa@, millert@
|
|
Make EVFILT_WRITE notifications on fifo work.
ok visa@, millert@
|
|
for example, with locking assertions.
OK mpi@, anton@
|
|
ok kettenis@
|
|
adding more filter properties without cluttering the struct.
OK mpi@, anton@
|
|
into read-only data segment.
OK deraadt@ tedu@
|
|
ok bluhm@
|