summaryrefslogtreecommitdiff
path: root/sbin/fsck/SMM.doc/2.t
diff options
context:
space:
mode:
Diffstat (limited to 'sbin/fsck/SMM.doc/2.t')
-rw-r--r--sbin/fsck/SMM.doc/2.t267
1 files changed, 267 insertions, 0 deletions
diff --git a/sbin/fsck/SMM.doc/2.t b/sbin/fsck/SMM.doc/2.t
new file mode 100644
index 00000000000..a4abd91fe4a
--- /dev/null
+++ b/sbin/fsck/SMM.doc/2.t
@@ -0,0 +1,267 @@
+.\" $NetBSD: 2.t,v 1.2 1995/03/18 14:56:08 cgd Exp $
+.\"
+.\" Copyright (c) 1982, 1993
+.\" The Regents of the University of California. All rights reserved.
+.\"
+.\" Redistribution and use in source and binary forms, with or without
+.\" modification, are permitted provided that the following conditions
+.\" are met:
+.\" 1. Redistributions of source code must retain the above copyright
+.\" notice, this list of conditions and the following disclaimer.
+.\" 2. Redistributions in binary form must reproduce the above copyright
+.\" notice, this list of conditions and the following disclaimer in the
+.\" documentation and/or other materials provided with the distribution.
+.\" 3. All advertising materials mentioning features or use of this software
+.\" must display the following acknowledgement:
+.\" This product includes software developed by the University of
+.\" California, Berkeley and its contributors.
+.\" 4. Neither the name of the University nor the names of its contributors
+.\" may be used to endorse or promote products derived from this software
+.\" without specific prior written permission.
+.\"
+.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
+.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
+.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+.\" SUCH DAMAGE.
+.\"
+.\" @(#)2.t 8.1 (Berkeley) 6/5/93
+.\"
+.ds RH Overview of the file system
+.NH
+Overview of the file system
+.PP
+The file system is discussed in detail in [Mckusick84];
+this section gives a brief overview.
+.NH 2
+Superblock
+.PP
+A file system is described by its
+.I "super-block" .
+The super-block is built when the file system is created (\c
+.I newfs (8))
+and never changes.
+The super-block
+contains the basic parameters of the file system,
+such as the number of data blocks it contains
+and a count of the maximum number of files.
+Because the super-block contains critical data,
+.I newfs
+replicates it to protect against catastrophic loss.
+The
+.I "default super block"
+always resides at a fixed offset from the beginning
+of the file system's disk partition.
+The
+.I "redundant super blocks"
+are not referenced unless a head crash
+or other hard disk error causes the default super-block
+to be unusable.
+The redundant blocks are sprinkled throughout the disk partition.
+.PP
+Within the file system are files.
+Certain files are distinguished as directories and contain collections
+of pointers to files that may themselves be directories.
+Every file has a descriptor associated with it called an
+.I "inode".
+The inode contains information describing ownership of the file,
+time stamps indicating modification and access times for the file,
+and an array of indices pointing to the data blocks for the file.
+In this section,
+we assume that the first 12 blocks
+of the file are directly referenced by values stored
+in the inode structure itself\(dg.
+.FS
+\(dgThe actual number may vary from system to system, but is usually in
+the range 5-13.
+.FE
+The inode structure may also contain references to indirect blocks
+containing further data block indices.
+In a file system with a 4096 byte block size, a singly indirect
+block contains 1024 further block addresses,
+a doubly indirect block contains 1024 addresses of further single indirect
+blocks,
+and a triply indirect block contains 1024 addresses of further doubly indirect
+blocks (the triple indirect block is never needed in practice).
+.PP
+In order to create files with up to
+2\(ua32 bytes,
+using only two levels of indirection,
+the minimum size of a file system block is 4096 bytes.
+The size of file system blocks can be any power of two
+greater than or equal to 4096.
+The block size of the file system is maintained in the super-block,
+so it is possible for file systems of different block sizes
+to be accessible simultaneously on the same system.
+The block size must be decided when
+.I newfs
+creates the file system;
+the block size cannot be subsequently
+changed without rebuilding the file system.
+.NH 2
+Summary information
+.PP
+Associated with the super block is non replicated
+.I "summary information" .
+The summary information changes
+as the file system is modified.
+The summary information contains
+the number of blocks, fragments, inodes and directories in the file system.
+.NH 2
+Cylinder groups
+.PP
+The file system partitions the disk into one or more areas called
+.I "cylinder groups".
+A cylinder group is comprised of one or more consecutive
+cylinders on a disk.
+Each cylinder group includes inode slots for files, a
+.I "block map"
+describing available blocks in the cylinder group,
+and summary information describing the usage of data blocks
+within the cylinder group.
+A fixed number of inodes is allocated for each cylinder group
+when the file system is created.
+The current policy is to allocate one inode for each 2048
+bytes of disk space;
+this is expected to be far more inodes than will ever be needed.
+.PP
+All the cylinder group bookkeeping information could be
+placed at the beginning of each cylinder group.
+However if this approach were used,
+all the redundant information would be on the top platter.
+A single hardware failure that destroyed the top platter
+could cause the loss of all copies of the redundant super-blocks.
+Thus the cylinder group bookkeeping information
+begins at a floating offset from the beginning of the cylinder group.
+The offset for
+the
+.I "i+1" st
+cylinder group is about one track further
+from the beginning of the cylinder group
+than it was for the
+.I "i" th
+cylinder group.
+In this way,
+the redundant
+information spirals down into the pack;
+any single track, cylinder,
+or platter can be lost without losing all copies of the super-blocks.
+Except for the first cylinder group,
+the space between the beginning of the cylinder group
+and the beginning of the cylinder group information stores data.
+.NH 2
+Fragments
+.PP
+To avoid waste in storing small files,
+the file system space allocator divides a single
+file system block into one or more
+.I "fragments".
+The fragmentation of the file system is specified
+when the file system is created;
+each file system block can be optionally broken into
+2, 4, or 8 addressable fragments.
+The lower bound on the size of these fragments is constrained
+by the disk sector size;
+typically 512 bytes is the lower bound on fragment size.
+The block map associated with each cylinder group
+records the space availability at the fragment level.
+Aligned fragments are examined
+to determine block availability.
+.PP
+On a file system with a block size of 4096 bytes
+and a fragment size of 1024 bytes,
+a file is represented by zero or more 4096 byte blocks of data,
+and possibly a single fragmented block.
+If a file system block must be fragmented to obtain
+space for a small amount of data,
+the remainder of the block is made available for allocation
+to other files.
+For example,
+consider an 11000 byte file stored on
+a 4096/1024 byte file system.
+This file uses two full size blocks and a 3072 byte fragment.
+If no fragments with at least 3072 bytes
+are available when the file is created,
+a full size block is split yielding the necessary 3072 byte
+fragment and an unused 1024 byte fragment.
+This remaining fragment can be allocated to another file, as needed.
+.NH 2
+Updates to the file system
+.PP
+Every working day hundreds of files
+are created, modified, and removed.
+Every time a file is modified,
+the operating system performs a
+series of file system updates.
+These updates, when written on disk, yield a consistent file system.
+The file system stages
+all modifications of critical information;
+modification can
+either be completed or cleanly backed out after a crash.
+Knowing the information that is first written to the file system,
+deterministic procedures can be developed to
+repair a corrupted file system.
+To understand this process,
+the order that the update
+requests were being honored must first be understood.
+.PP
+When a user program does an operation to change the file system,
+such as a
+.I write ,
+the data to be written is copied into an internal
+.I "in-core"
+buffer in the kernel.
+Normally, the disk update is handled asynchronously;
+the user process is allowed to proceed even though
+the data has not yet been written to the disk.
+The data,
+along with the inode information reflecting the change,
+is eventually written out to disk.
+The real disk write may not happen until long after the
+.I write
+system call has returned.
+Thus at any given time, the file system,
+as it resides on the disk,
+lags the state of the file system represented by the in-core information.
+.PP
+The disk information is updated to reflect the in-core information
+when the buffer is required for another use,
+when a
+.I sync (2)
+is done (at 30 second intervals) by
+.I "/etc/update" "(8),"
+or by manual operator intervention with the
+.I sync (8)
+command.
+If the system is halted without writing out the in-core information,
+the file system on the disk will be in an inconsistent state.
+.PP
+If all updates are done asynchronously, several serious
+inconsistencies can arise.
+One inconsistency is that a block may be claimed by two inodes.
+Such an inconsistency can occur when the system is halted before
+the pointer to the block in the old inode has been cleared
+in the copy of the old inode on the disk,
+and after the pointer to the block in the new inode has been written out
+to the copy of the new inode on the disk.
+Here,
+there is no deterministic method for deciding
+which inode should really claim the block.
+A similar problem can arise with a multiply claimed inode.
+.PP
+The problem with asynchronous inode updates
+can be avoided by doing all inode deallocations synchronously.
+Consequently,
+inodes and indirect blocks are written to the disk synchronously
+(\fIi.e.\fP the process blocks until the information is
+really written to disk)
+when they are being deallocated.
+Similarly inodes are kept consistent by synchronously
+deleting, adding, or changing directory entries.
+.ds RH Fixing corrupted file systems