Age | Commit message (Collapse) | Author |
|
Rename klist_{insert,remove}() to klist_{insert,remove}_locked().
These functions assume that the caller has locked the klist. The current
state of locking remains intact because the kernel lock is still used
with all klists.
Add new functions klist_insert() and klist_remove() that lock the klist
internally. This allows some code simplification.
OK mpi@
|
|
reverts changes from msdosfs_vfsops.c rev 1.7
Prompted by a patch from John Carmack to add an an error path when exFAT
is detected on mount to give a more helpful error message.
Returning EINVAL in the existing sanity checks will make mount_msdos(8)
print "not an MSDOS filesystem" when attempting to mount exFAT and
matches historic and documented behaviour.
ok kn@
|
|
While here prefix kernel-only EV flags with two underbars.
Suggested by kettenis@, ok visa@
|
|
Adapt FS kqfilters to always return true when the flag is set and bypass
the polling mechanism of the NFS thread.
While here implement a write filter for NFS.
ok visa@
|
|
for example, with locking assertions.
OK mpi@, anton@
|
|
CID 1452873
|
|
- ufs_chown() & ufs_chmod()
- ufs_reclaim()
- ext2fs_chown() & ext2fs_chmod()
- ntfs_ntget() & ntfs_ntput()
- ntfs_vgetex(), ntfs_ntlookup() & ntfs_ntlookupfile()
While here use `ap->a_p' directly when it is only required to re-enter
the VFS layer in order to help reducing the loop.
ok visa@
|
|
adding more filter properties without cluttering the struct.
OK mpi@, anton@
|
|
|
|
into read-only data segment.
OK deraadt@ tedu@
|
|
make the structs const so that the data are put in .rodata.
OK mpi@, deraadt@, anton@, bluhm@
|
|
OK visa@
|
|
This support is undocumented, only works if you're using the kernel
timezone, and breaks during a DST shift. It also preferences file systems
managed by a Windows installation: many implementations, like ours, use
UTC by default (think: phones, digital cameras).
No complaints on tech@.
"good riddance" tedu@, "Yep." deraadt@
|
|
serializing both read/write operations using the existing file mutex.
The vnode lock still grants exclusive write access to the offset; the
mutex is only used to make the actual write atomic and prevent any
concurrent reader from observing intermediate values.
ok mpi@ visa@
|
|
|
|
|
|
https://marc.info/?l=openbsd-cvs&m=156277704122293&w=2
ok anton@
|
|
as part of the effort to unlock the kernel. Instead of relying on the
vnode lock, introduce a dedicated lock per file. Exclusive write access
is granted using the new foffset_enter and foffset_leave API. A
convenience function foffset_get is also available for threads that only
need to read the current offset.
The lock acquisition order in vn_write has been changed to match the one
in vn_read in order to avoid a potential deadlock. This change also gets
rid of a documented race in vn_read().
Inspired by the FreeBSD implementation.
With help and ok mpi@ visa@
|
|
structure allows for better tracking of pending lock operations which is
essential in order to prevent a use-after-free once the underlying vnode is
gone.
Inspired by the lockf implementation in FreeBSD.
ok visa@
Reported-by: syzbot+d5540a236382f50f1dac@syzkaller.appspotmail.com
|
|
implementations. Rely on the VFS layer to do the checking.
OK mpi@, helg@
|
|
unlocking the directory vnode.
OK mpi@, helg@
|
|
OK mpi@
|
|
With sf@, inputs from krw@, tested by many, ok visa@
|
|
unnecessary because curproc always does the locking.
OK mpi@
|
|
curproc that does the locking or unlocking, so the proc parameter
is pointless and can be dropped.
OK mpi@, deraadt@
|
|
to appease WITNESS. ext2fs and ffs already use the flag. The same
locking pattern appears with other file systems too, so this patch
addresses the remaining cases.
OK mpi@
|
|
are pushed to disk. Dangling vnodes (unlinked files still in use) and
vnodes undergoing change by long-running syscalls are identified -- and
such filesystems are marked dirty on-disk while we are suspended (in case
power is lost, a fsck will be required). Filesystems without dangling or
busy vnodes are marked clean, resulting in faster boots following
"battery died" circumstances.
Tested by numerous developers, thanks for the feedback.
|
|
ok millert@ sthen@
|
|
ok deraadt@ krw@
|
|
ok millert@ krw@
|
|
for blocks re-fetchable from the filesystem. However at reboot time,
filesystems are unmounted, and since processes lack backing store they
are killed. Since the scheduler is still running, in some cases init is
killed... which drops us to ddb [noted by bluhm]. Solution is to convert
filesystems to read-only [proposed by kettenis]. The tale follows:
sys_reboot() should pass proc * to MD boot() to vfs_shutdown() which
completes current IO with vfs_busy VB_WRITE|VB_WAIT, then calls VFS_MOUNT()
with MNT_UPDATE | MNT_RDONLY, soon teaching us that *fs_mount() calls a
copyin() late... so store the sizes in vfsconflist[] and move the copyin()
to sys_mount()... and notice nfs_mount copyin() is size-variant, so kill
legacy struct nfs_args3. Next we learn ffs_mount()'s MNT_UPDATE code is
sharp and rusty especially wrt softdep, so fix some bugs adn add
~MNT_SOFTDEP to the downgrade. Some vnodes need a little more help,
so tie them to &dead_vnops.
ffs_mount calling DIOCCACHESYNC is causing a bit of grief still but
this issue is seperate and will be dealt with in time.
couple hundred reboots by bluhm and myself, advice from guenther and
others at the hut
|
|
(forgot to commit fat.h)
Add new CLUST_END and use it as parameter to pcbmap() when searching
for end cluster, instead of explicitly passing 0xffff. This fixes potential
problem for FAT32, where cluster number may be legally bigger than 0xffff.
Also change clusteralloc() so that fillwith is not explicitly passed by caller
anymore (there is no need to use anything other than CLUST_EOFE).
From NetBSD commit by jdolecek@NetBSD.org
ok tb@ mpi@
|
|
Add new CLUST_END and use it as parameter to pcbmap() when searching
for end cluster, instead of explicitly passing 0xffff. This fixes potential
problem for FAT32, where cluster number may be legally bigger than 0xffff.
Also change clusteralloc() so that fillwith is not explicitly passed by caller
anymore (there is no need to use anything other than CLUST_EOFE).
From NetBSD commit by jdolecek@NetBSD.org
ok tb@ mpi@
|
|
* add to comments for pcbmap()
* remove useless ";"
ok tb@
|
|
This has again caused regressions, this time when reading from msdosfs.
This reverts
denode.h 1.31
msdosfs_vnops.c 1.114
Requested by deraadt@
|
|
ok deraadt@
|
|
This is the reverted commit by mpi@ from msdosfs_vnops.c 1.105 plus some
additional tweaks to fix some cluster/block number confusion that lead
to regressions when seeking past the end of a file.
The original commit message was:
The logic used in msdosfs_bmap() to loop calling pcbmap() comes from
FreeBSD and is not really efficient but it is good enough since it is
only called when generating I/O.
With this diff I get a 100% improvement when reading big files from a
crappy USB stick.
With this and bread_cluster(9) modified to not re-fetch B_CACHED buffers,
reading large contiguous files with chunk sizes of MAXPHYS is almost as
fast as physio(9) on the same device.
For a 'real world' example, when copying music files from a USB stick I
see a speed jump from 15MB/s on -current to 24Mb/s with this diff.
While here rename some 'lbn' variables into 'cn' to better reflect what
we're dealing with.
Tested by Mathieu, with support from deraadt@
ok mpi@
|
|
on amd64 and i386.
|
|
has been fixed in FreeBSD in 2002. No binary change.
From Alexander von Gernler; OK krw@
|
|
file system. In modern images the field is not set properly and
the value is not used anyway. FreeBSD has removed the check already
in 2008.
From Alexander von Gernler; OK krw@
|
|
|
|
|
|
This caused garbage to be written instead of blocks of 0-bytes if a
file is extended by seeking past the end. This happens for example
when extracting files containing lots of 0-bytes with tar.
ok mpi@
Details of original commit:
msdosfs_vnops.c revision 1.105
denode.h revision 1.28
date: 2016/01/13 10:00:55; author: mpi; commitid: ru9jHQwQX09BC5Bw;
Implement VFS read clustering for MSDOSFS.
The logic used in msdosfs_bmap() to loop calling pcbmap() comes from
FreeBSD and is not really efficient but it is good enough since it is
only called when generating I/O.
With this diff I get a 100% improvement when reading big files from a
crappy USB stick.
With this and bread_cluster(9) modified to not re-fetch B_CACHED buffers,
reading large contiguous files with chunk sizes of MAXPHYS is almost as
fast as physio(9) on the same device.
For a 'real world' example, when copying music files from a USB stick I
see a speed jump from 15MB/s on -current to 24Mb/s with this diff.
While here rename some 'lbn' variables into 'cn' to better reflect what
we're dealing with.
Tested by Mathieu, with support from deraadt@
|
|
ok kettenis@ krw@ natano@ dlg@ espie@
|
|
trivial change to use rrw locks instead. All it needs is LK_* defines
for the RW_* flags.
tested by naddy and sthen on package building infrastructure
input and ok jmc mpi tedu
|
|
usb stack was busy, the kernel could trigger an uvm fault. There
is a race between vop_generic_revoke() and sys_mount() where vgonel()
could reset v_specinfo. Then v_specmountpoint is no longer valid.
So after sleeping, msdosfs_mountfs() could crash in the error path.
The code in the different *_mountfs() functions was inconsistent,
implement the same check everywhere.
OK krw@ natano@
|
|
for readable directories, while making it subject to the mask option
(-m in mount_msdos), so it is still possible to mount with
non-executable directories, but with semantics that are easier to
comprehend.
This makes directory listings with default mount options work again.
ok deraadt@
|
|
noone should be executing a binary from a msdos filesystem, considering
the mountpoint tracking permission mode model
ok natano krw
|
|
msdosfs and nfsv2 don't set f_namemax. ntfs and ext2fs don't set
f_namemeax and f_favail. fusefs doesn't set f_mntfromspec, f_favail and
f_iosize. Also, make all filesystems use copy_statfs_info(), so that all
statfs information is filled in correctly for the (sb != &mp->mnt-stat)
case.
ok stefan
|
|
could end up in an inconsistent state. The fstype dependent
mp->mnt_data was NULL, but the general mp was still listed as a
valid mount point. Next access to the file system would crash with
a NULL pointer dereference.
If closing the device fails, the mount point must go away anyway.
There is nothing we can do about it. Remove the workaround for the
EIO error in the general unmount code, but do not generate any error
in the file system specific unmount functions.
OK natano@ beck@
|