src - OpenBSD base system

Age	Commit message	Author
2023-03-08	Delete obsolete /* ARGSUSED */ lint comments.	Philip Guenther
	ok miod@ millert@
2022-08-23	msdosfs: don't pass NULL proc pointer to detrunc()	Scott Soule Cheloha
	detrunc()'s proc pointer argument may be passed to vinvalbuf(9), which under certain conditions will pass the given proc pointer to VOP_FSYNC(9), which always asserts that the given proc pointer is equal to curproc.
	msdosfs_write(), msdosfs_inactive(), createde(), and deextend() all pass NULL for detrunc()'s proc pointer argument. I have no idea why. If these detrunc() calls ever reach VOP_FSYNC(9) the kernel will panic. So, for example, any user with write access to an msdosfs partition can panic the kernel by writing to the partition until they cause ENOSPC.
	That particular panic looks like this:
	panic: kernel diagnostic assertion "p == curproc" failed: file "/usr/src/sys/kern/vfs_vops.c", line 305
	Stopped at db_enter+0xa: popq %rbp
	TID PID UID PRFLAGS PFLAGS CPU COMMAND
	*500294 8955 0 0x100003 0 1K ksh
	db_enter() at db_enter+0xa
	panic(ffffffff81f1b0cf) at panic+0xc4
	__assert(ffffffff81fa361c,ffffffff81ee8329,131,ffffffff81f7229b) at assert+0x3b
	VOP_FSYNC(fffffd8449a78b30,ffffffffffffffff,1,0) at VOP_FSYNC+Oxd6
	vinvalbuf(fffffd8449a78b30,3,ffffffffffffffff,0,0,ffffffffffffffff) at vinvalbuf+0xd5
	detrunc(ffff80000186f900,1fe,0,ffffffffffffffff,0) at detrunc+0x239
	msdosfs_write(ffff800055774b98) at msdosf_write+0x4a4
	VOP_WRITE(fffffd8449a78b30,ffff800055774d10,3,fffffd8370e8d5d0) at VOP_WRITE+0x59
	vn_write(fffffd83c723b860,ffff800055774d10,0) at vn_write+0xc0
	dofilewritev(ffff8000556ecfc0,1,ffff800055774d10,0.ffff800055774dc0) at dofilewritev+0x14d
	sys_write(ffff8000556ecfc0,ffff800055774dd0,ffff800055774dc0) at sys_write+0x6a
	syscall(ffff800055774e70) at syscall+0x39b
	Xsyscall() at Xsyscall+0x128
	end of kernel
	end trace frame: 0x7f7ffffd8bf0, count: 2
	This patch tweaks all the detrunc() calls in the aforementioned msdosfs functions to pass curproc instead of a NULL pointer to detrunc(). We don't appear to have curproc stashed anywhere in msdosfs_write() or deextend(), so for those calls we explicitly pass curproc.
	This might have unforeseen consequences I can't anticipate. However, with this patch I can no longer panic the kernel by filling an msdosfs partition, which seems like an improvement. With advice from gnezdo@. ok gnezdo@
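The failure mode described in the commit above can be sketched in userland C. This is only an illustration under stated assumptions: `curproc_sim`, `fsync_sim()` and `detrunc_sim()` are hypothetical stand-ins, not the kernel's curproc, VOP_FSYNC(9) or detrunc(), and the sketch returns -1 where the real kernel would panic on the failed assertion.

```c
#include <assert.h>	/* for the checks shown in the usage note */
#include <stddef.h>

/* Hypothetical userland stand-ins; none of this is kernel code. */
struct proc {
	int	pid;
};

struct proc proc0_sim = { 1 };
struct proc *curproc_sim = &proc0_sim;	/* plays the role of curproc */

/*
 * Models the "p == curproc" check: returns 0 when the check holds,
 * -1 where the real kernel would panic.
 */
int
fsync_sim(struct proc *p)
{
	return (p == curproc_sim) ? 0 : -1;
}

/*
 * Models a truncation helper that forwards its proc pointer only on
 * the path that has to flush buffers.
 */
int
detrunc_sim(int must_flush, struct proc *p)
{
	if (!must_flush)
		return 0;	/* a NULL pointer goes unnoticed here */
	return fsync_sim(p);
}
```

With these definitions, `detrunc_sim(0, NULL)` returns 0, `detrunc_sim(1, NULL)` returns -1, and `detrunc_sim(1, curproc_sim)` returns 0. That mirrors why a NULL proc pointer could stay latent until a rarely taken path, such as the ENOSPC case, was reached.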
2022-08-15	remove msdosfs findwin95()	Jonathan Gray
	unused since msdosfs_vfsops.c 1.95 ok miod@ millert@
2022-08-12	Put more struct vnode fields under splbio().	Visa Hankala
	Buffer cache related struct vnode fields can be accessed in interrupt context. Be more consistent with the use of splbio(). OK mpi@
2022-06-26	Remove unused VOP_POLL().	Visa Hankala
	OK mpi@
2022-01-11	spelling	Jonathan Gray
	ok jmc@
2021-12-23	make array bounds in unix2dosfn() prototype match function	Jonathan Gray
	missed when unix2dosfn() was changed with msdosfs_conv.c rev 1.15 in 2012
2021-12-12	Add vnode parameter to VOP_STRATEGY()	Visa Hankala
	Pass the device vnode as a parameter to VOP_STRATEGY() to allow calling the correct vop_strategy callback. Now the vnode is also available in the callback. OK mpi@
2021-12-11	Clarify usage of __EV_POLL and __EV_SELECT	Visa Hankala
	Make __EV_POLL specific to kqueue-based poll(2), to remove overlap with __EV_SELECT that only select(2) uses. OK millert@ mpi@
2021-11-13	Use long filenames by default on FAT filesystems	Klemens Nanni
	These days, 8.3 filenames are often a problem, filesystems containing firmware with long names must not truncate them -- it's also a sane default as portable file system between OSes, anyway. Although undocumented in mount_msdos(8), the default for FAT32 already is to use long filenames: ever since its import from NetBSD in 1998. Previously, mount_msdos would ignore long filenames and default to short filenames unless a flag was used or long ones were found on the filesystem prior to mounting it. Just always mount with support for long filenames (unless `-s' is used). As various install media use FAT filesystems, adjust the remaining ones to also pass explicit mount option reflecting the previous default. OK deraadt
2021-07-11	correct comment	Jonathan Gray
	from Jonathan Kollasch in NetBSD
2021-03-11	spelling	Jonathan Gray

2020-12-25	Refactor klist insertion and removal	Visa Hankala
	Rename klist_{insert,remove}() to klist_{insert,remove}_locked(). These functions assume that the caller has locked the klist. The current state of locking remains intact because the kernel lock is still used with all klists. Add new functions klist_insert() and klist_remove() that lock the klist internally. This allows some code simplification. OK mpi@
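The split between a `_locked` variant and a self-locking wrapper, as described in the commit above, can be sketched as follows. This is a single-threaded userland illustration only: `struct list_sketch`, `list_lock()`, `list_insert_locked()` and `list_insert()` are hypothetical names, and the `locked` flag merely stands in for a real lock so that the precondition can be asserted.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical list types; not the kernel's struct klist or struct knote. */
struct node_sketch {
	struct node_sketch	*next;
};

struct list_sketch {
	int			 locked;	/* stand-in for a real lock */
	struct node_sketch	*head;
	size_t			 len;
};

void
list_lock(struct list_sketch *l)
{
	assert(!l->locked);
	l->locked = 1;
}

void
list_unlock(struct list_sketch *l)
{
	assert(l->locked);
	l->locked = 0;
}

/* Caller must already hold the list lock. */
void
list_insert_locked(struct list_sketch *l, struct node_sketch *n)
{
	assert(l->locked);
	n->next = l->head;
	l->head = n;
	l->len++;
}

/* Convenience wrapper that takes and releases the lock itself. */
void
list_insert(struct list_sketch *l, struct node_sketch *n)
{
	list_lock(l);
	list_insert_locked(l, n);
	list_unlock(l);
}
```

Callers that already hold the lock for other reasons use the `_locked` form. Everyone else calls the wrapper and drops the explicit lock and unlock pair, which is the kind of simplification the commit message mentions.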
2020-08-10	consistently return EINVAL on invalid BPB	Jonathan Gray
	reverts changes from msdosfs_vfsops.c rev 1.7 Prompted by a patch from John Carmack to add an error path when exFAT is detected on mount to give a more helpful error message. Returning EINVAL in the existing sanity checks will make mount_msdos(8) print "not an MSDOS filesystem" when attempting to mount exFAT and matches historic and documented behaviour. ok kn@
2020-06-11	Rename poll-compatibility flag to better reflect what it is.	Martin Pieuchot
	While here prefix kernel-only EV flags with two underbars. Suggested by kettenis@, ok visa@
2020-06-08	Use a new EV_OLDAPI flag to match the behavior of poll(2) and select(2).	Martin Pieuchot
	Adapt FS kqfilters to always return true when the flag is set and bypass the polling mechanism of the NFS thread. While here implement a write filter for NFS. ok visa@
2020-04-07	Abstract the head of knote lists. This allows extending the lists,	Visa Hankala
	for example, with locking assertions. OK mpi@, anton@
2020-03-24	Kill some dead code that tests bits immediately after setting them.	Kenneth R Westerback
	CID 1452873
2020-02-27	Remove unused "struct proc *" argument from the following functions:	Martin Pieuchot
	- ufs_chown() & ufs_chmod()
	- ufs_reclaim()
	- ext2fs_chown() & ext2fs_chmod()
	- ntfs_ntget() & ntfs_ntput()
	- ntfs_vgetex(), ntfs_ntlookup() & ntfs_ntlookupfile()
	While here use `ap->a_p' directly when it is only required to re-enter the VFS layer in order to help reducing the loop. ok visa@
2020-02-20	Replace field f_isfd with field f_flags in struct filterops to allow	Visa Hankala
	adding more filter properties without cluttering the struct. OK mpi@, anton@
2020-01-24	remove a notyet that remains more not than yet after 25 years. ok krw	Ted Unangst

2020-01-20	struct vops is not modified during runtime so use const which moves each	Claudio Jeker
	into read-only data segment. OK deraadt@ tedu@
2019-12-31	Use C99 designated initializers with struct filterops. In addition,	Visa Hankala
	make the structs const so that the data are put in .rodata. OK mpi@, deraadt@, anton@, bluhm@
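A minimal sketch of the style this commit describes, assuming a simplified operations table: `struct filterops_sketch` and its member names are hypothetical and do not reproduce the kernel's struct filterops. Designated initializers name each member explicitly, and the `const` qualifier lets the toolchain place the table in read-only data.

```c
#include <assert.h>	/* for the checks shown in the usage note */
#include <stddef.h>

/* Hypothetical operations table; not the kernel's struct filterops. */
struct filterops_sketch {
	int	  flags;
	int	(*attach)(void *);
	void	(*detach)(void *);
	int	(*event)(void *, long);
};

void
sketch_detach(void *arg)
{
	(void)arg;
}

int
sketch_event(void *arg, long hint)
{
	(void)arg;
	return hint != 0;
}

/*
 * Members are named, so their order in the initializer does not have
 * to match the declaration, and any member left out is zero-initialized.
 */
const struct filterops_sketch sketch_filtops = {
	.flags	= 1,
	.attach	= NULL,
	.detach	= sketch_detach,
	.event	= sketch_event,
};
```

Compared with a positional initializer such as `{ 1, NULL, sketch_detach, sketch_event }`, this form keeps working if a member is added or reordered in the struct declaration.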
2019-12-26	Convert struct vfsops initializer to C99 style.	Alexander Bluhm
	OK visa@
2019-09-04	msdosfs: remove timezone support	cheloha
	This support is undocumented, only works if you're using the kernel timezone, and breaks during a DST shift. It also preferences file systems managed by a Windows installation: many implementations, like ours, use UTC by default (think: phones, digital cameras). No complaints on tech@. "good riddance" tedu@, "Yep." deraadt@
2019-08-05	Allow concurrent reads of the f_offset field of struct file by	anton
	serializing both read/write operations using the existing file mutex. The vnode lock still grants exclusive write access to the offset; the mutex is only used to make the actual write atomic and prevent any concurrent reader from observing intermediate values. ok mpi@ visa@
2019-07-25	vinvalbuf(9): tsleep -> tsleep_nsec(9); ok millert@	cheloha

2019-07-19	getblk(9): tsleep(9) -> tsleep_nsec(9); ok visa@	cheloha

2019-07-12	Revert anton@ changes about read/write unlocking	solene
	https://marc.info/?l=openbsd-cvs&m=156277704122293&w=2 ok anton@
2019-07-10	Make read/write of the f_offset field belonging to struct file MP-safe;	anton
	as part of the effort to unlock the kernel. Instead of relying on the vnode lock, introduce a dedicated lock per file. Exclusive write access is granted using the new foffset_enter and foffset_leave API. A convenience function foffset_get is also available for threads that only need to read the current offset. The lock acquisition order in vn_write has been changed to match the one in vn_read in order to avoid a potential deadlock. This change also gets rid of a documented race in vn_read(). Inspired by the FreeBSD implementation. With help and ok mpi@ visa@
2019-01-21	Introduce a dedicated entry point data structure for file locks. This new data	anton
	structure allows for better tracking of pending lock operations which is essential in order to prevent a use-after-free once the underlying vnode is gone. Inspired by the lockf implementation in FreeBSD. ok visa@ Reported-by: syzbot+d5540a236382f50f1dac@syzkaller.appspotmail.com
2018-06-21	Drop redundant "node == parent node" checks from VOP_RMDIR()	Visa Hankala
	implementations. Rely on the VFS layer to do the checking. OK mpi@, helg@
2018-06-07	Make callers of VOP_CREATE(9) and VOP_MKNOD(9) responsible for	Visa Hankala
	unlocking the directory vnode. OK mpi@, helg@
2018-05-27	Drop unnecessary `p' parameter from vget(9).	Visa Hankala
	OK mpi@
2018-05-07	Implement VFS read clustering for MSDOSFS, take 3.	Martin Pieuchot
	With sf@, inputs from krw@, tested by many, ok visa@
2018-05-02	Remove proc from the parameters of vn_lock(). The parameter is	Visa Hankala
	unnecessary because curproc always does the locking. OK mpi@
2018-04-28	Clean up the parameters of VOP_LOCK() and VOP_UNLOCK(). It is always	Visa Hankala
	curproc that does the locking or unlocking, so the proc parameter is pointless and can be dropped. OK mpi@, deraadt@
2018-03-28	Use RWL_IS_VNODE with locks that are acquired through VOP_LOCK(),	Visa Hankala
	to appease WITNESS. ext2fs and ffs already use the flag. The same locking pattern appears with other file systems too, so this patch addresses the remaining cases. OK mpi@
2018-02-10	Synchronize filesystems to disk when suspending. Each mountpoint's vnodes	Theo de Raadt
	are pushed to disk. Dangling vnodes (unlinked files still in use) and vnodes undergoing change by long-running syscalls are identified -- and such filesystems are marked dirty on-disk while we are suspended (in case power is lost, a fsck will be required). Filesystems without dangling or busy vnodes are marked clean, resulting in faster boots following "battery died" circumstances. Tested by numerous developers, thanks for the feedback.
2018-01-02	Stop assuming <sys/file.h> will pull in fcntl.h when _KERNEL is defined.	Philip Guenther
	ok millert@ sthen@
2017-12-30	Don't pull in <sys/file.h> just to get fcntl.h	Philip Guenther
	ok deraadt@ krw@
2017-12-30	Delete unnecessary <sys/file.h> includes	Philip Guenther
	ok millert@ krw@
2017-12-11	In uvm Chuck decided backing store would not be allocated proactively	Theo de Raadt
	for blocks re-fetchable from the filesystem. However at reboot time, filesystems are unmounted, and since processes lack backing store they are killed. Since the scheduler is still running, in some cases init is killed... which drops us to ddb [noted by bluhm]. Solution is to convert filesystems to read-only [proposed by kettenis]. The tale follows: sys_reboot() should pass proc * to MD boot() to vfs_shutdown() which completes current IO with vfs_busy VB_WRITE|VB_WAIT, then calls VFS_MOUNT() with MNT_UPDATE | MNT_RDONLY, soon teaching us that *fs_mount() calls a copyin() late... so store the sizes in vfsconflist[] and move the copyin() to sys_mount()... and notice nfs_mount copyin() is size-variant, so kill legacy struct nfs_args3. Next we learn ffs_mount()'s MNT_UPDATE code is sharp and rusty especially wrt softdep, so fix some bugs and add ~MNT_SOFTDEP to the downgrade. Some vnodes need a little more help, so tie them to &dead_vnops. ffs_mount calling DIOCCACHESYNC is causing a bit of grief still but this issue is separate and will be dealt with in time. couple hundred reboots by bluhm and myself, advice from guenther and others at the hut
2017-08-14	msdosfs: Add new CLUST_END constant	Stefan Fritsch
	(forgot to commit fat.h) Add new CLUST_END and use it as parameter to pcbmap() when searching for end cluster, instead of explicitly passing 0xffff. This fixes potential problem for FAT32, where cluster number may be legally bigger than 0xffff. Also change clusteralloc() so that fillwith is not explicitly passed by caller anymore (there is no need to use anything other than CLUST_EOFE). From NetBSD commit by jdolecek@NetBSD.org ok tb@ mpi@
2017-08-14	msdosfs: Add new CLUST_END constant	Stefan Fritsch
	Add new CLUST_END and use it as parameter to pcbmap() when searching for end cluster, instead of explicitly passing 0xffff. This fixes potential problem for FAT32, where cluster number may be legally bigger than 0xffff. Also change clusteralloc() so that fillwith is not explicitly passed by caller anymore (there is no need to use anything other than CLUST_EOFE). From NetBSD commit by jdolecek@NetBSD.org ok tb@ mpi@
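Why 0xffff is a poor "walk to the end" marker on FAT32 can be shown with a small userland sketch. Everything here is an assumption made for illustration: `last_cluster_index()` is a hypothetical helper, not pcbmap(), and `SKETCH_CLUST_END` is an assumed sentinel value chosen only to be larger than any valid cluster index.

```c
#include <assert.h>	/* for the checks shown in the usage note */
#include <stdint.h>

/* Assumed sentinel, larger than any valid file-relative cluster index. */
#define SKETCH_CLUST_END	0xffffffffu

/*
 * Walk a cluster chain of chain_len entries looking for file-relative
 * cluster findcn, and return the index at which the walk stops: either
 * findcn itself or the last cluster of the chain, whichever comes first.
 */
uint32_t
last_cluster_index(uint32_t chain_len, uint32_t findcn)
{
	uint32_t i;

	if (chain_len == 0)
		return 0;
	for (i = 0; i + 1 < chain_len && i < findcn; i++)
		continue;
	return i;
}
```

For a 10-cluster file both `0xffff` and `SKETCH_CLUST_END` stop at index 9. For a 70000-cluster file, which is possible on FAT32, `last_cluster_index(70000, 0xffff)` stops at index 65535 rather than at the real end, 69999, while `SKETCH_CLUST_END` reaches it.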
2017-08-13	minor msdosfs tweaks	Stefan Fritsch
	* add to comments for pcbmap()
	* remove useless ";"
	ok tb@
2017-06-13	Revert 'Implement VFS read clustering for MSDOSFS' again	Stefan Fritsch
	This has again caused regressions, this time when reading from msdosfs. This reverts denode.h 1.31 msdosfs_vnops.c 1.114 Requested by deraadt@
2017-05-29	msdosfs & ffs: flush cache if updating mount from r/w to r/o	Stefan Fritsch
	ok deraadt@
2017-05-29	Implement VFS read clustering for MSDOSFS	Stefan Fritsch
	This is the reverted commit by mpi@ from msdosfs_vnops.c 1.105 plus some additional tweaks to fix some cluster/block number confusion that led to regressions when seeking past the end of a file. The original commit message was: The logic used in msdosfs_bmap() to loop calling pcbmap() comes from FreeBSD and is not really efficient but it is good enough since it is only called when generating I/O. With this diff I get a 100% improvement when reading big files from a crappy USB stick. With this and bread_cluster(9) modified to not re-fetch B_CACHED buffers, reading large contiguous files with chunk sizes of MAXPHYS is almost as fast as physio(9) on the same device. For a 'real world' example, when copying music files from a USB stick I see a speed jump from 15MB/s on -current to 24MB/s with this diff. While here rename some 'lbn' variables into 'cn' to better reflect what we're dealing with. Tested by Mathieu, with support from deraadt@ ok mpi@
2017-04-20	Tweak lock inits to make the system runnable with witness(4)	Visa Hankala
	on amd64 and i386.