Linux File System Study Guide

Background reference:
Matt Welsh & Lar Kaufman, Running Linux,
O'Reilly & Associates, 2nd Ed., 1996.
Chapter 6

  1. Explain why a system administrator should know the details of a file manager.

    Besides multi-tasking and multi-user management, file management is the most important task of the operating system. There always has been, and it seems there will always be, a hierarchy of memory technologies in which memory with shorter access latency costs far more than memory with longer access latency. The role of the file manager is to hide the details of this hierarchy and automatically move programs and data from slower memory to faster main memory. Just as a race car driver must know what engine configuration works best for which type of race, the system administrator must know what file manager configuration works best for the type of work the OS is to perform. Understanding key concepts now will reap the rewards of a well-tuned system later.

  2. Describe the role of a file manager and describe Linux's file manager.

    A file manager organizes the hard disk (partition) into four areas: an administrative description of the hard disk (super block), one or more areas of index lists (inodes), allocated file data pointed to by the index lists (directories and regular files), and lists of yet-to-be-used blocks (the free list).

    Before a file system can be used (mounted), it must be formatted (mkfs) by the file manager. Mounting an unformatted partition or corrupted partition will crash the operating system since it relies upon the file manager for access to all of its utilities.

    Unix implementations have one file manager. Depending on how the kernel is configured, Linux could have as many as 12 file managers! The "native" root file system may be MINIX or ext2. The more common mounted volumes may be proc, NFS, msdos, Novell, OS/2, Windows NT, CD-ROM, or Windows file sharing. There are also file managers for other versions of Unix such as Xenix, System V, and Coherent.

  3. Explain the role of Linux's "special" files and describe five examples.

    Like Unix, Linux uses the file manager to create "system abstractions" or special purpose files. These constructs are not files at all, but kernel device drivers and other "logical" drivers that perform unique functions when accessed as a file.

    1. /dev/null - The "rat hole" dumps input into the "bit bucket" when written and returns end-of-file when read.
    2. /dev/zero - When read, it returns a buffer full of zeros.
    3. /dev/tty - When accessed, it always points to the program's specific controlling /dev/ttyx.
    4. /dev/hda1 - Provides direct access to the hard disk without file manager interpretation.
    5. /dev/ram - Makes high memory look like a hard disk (for rescue).
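
    The behavior of the first two special files can be observed from any program. The following is a minimal sketch, assuming a Linux system where /dev/zero and /dev/null exist; the function name probe_special_files is hypothetical and chosen only for illustration.

```python
def probe_special_files(n=16):
    """Read n bytes from /dev/zero and /dev/null, then write n bytes to /dev/null."""
    with open("/dev/zero", "rb") as zero:
        zeros = zero.read(n)             # a buffer of n zero bytes
    with open("/dev/null", "rb") as null:
        eof = null.read(n)               # end-of-file at once: empty bytes
    with open("/dev/null", "wb") as null:
        written = null.write(b"x" * n)   # accepted, then discarded
    return zeros, eof, written


if __name__ == "__main__":
    zeros, eof, written = probe_special_files()
    print(len(zeros), eof, written)
```

    No file manager data blocks are involved in either case; the kernel drivers behind the names produce or discard the bytes.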

  4. Explain how the "/proc" abstraction has evolved in Linux compared to other Unix implementations.

    Historically, Unix allowed user access to kernel variables and the currently running user process through the /dev/kmem and /dev/umem abstractions. Assuming the kernel or user object modules were not stripped of their symbol tables, a debugger could be used to access the kernel's or user's variables through /dev/kmem or /dev/umem. In more recent versions of Unix, the /proc directory was created and combined with a logical driver. This design was better than before, but it still required "ioctl" service calls to access kernel variables.

    Linux has extended this concept by creating a separate file system manager called "/proc" which is mounted like other file systems. The /proc manager separates parts of the kernel into directories. Accessing a given /proc file yields the variable name and value. The /proc abstraction also permits access to user processes as well as device drivers.

  5. Describe how the file system structure and the mount command work to create a transparent file system.

    1. First block of the file system (super block) contains a volume relative index to one or more "inode" sections.
    2. Inodes, in turn, index to directory files and regular files.
    3. First inode always "points" to root directory file.
    4. The booted root file system is "hard wired" into the kernel image (see rdev).
    5. A mounted directory is the root directory of the mounted file system.

    To access a file, the file manager must first decode a series of directories or "path components." As the inode number for each component is discovered in a directory, the mount table is consulted to see if that inode number had been previously used to mount another volume (file system). If the inode number was in the mount table, the contents of the inode are ignored and, instead, inode number one in the new file system is accessed.
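
    The path traversal described above can be sketched as a toy model. Everything here is hypothetical: the volume names, the inode numbers, and the dictionary layout are invented for illustration and do not reflect the kernel's data structures. The sketch follows the guide in treating inode number one as the root directory.

```python
ROOT_INODE = 1  # the guide's "first inode"; real ext2 uses inode 2 for the root

# Hypothetical volumes: inode number -> directory (dict) or file contents (str).
VOLUMES = {
    "rootfs": {
        1: {"usr": 2, "etc": 3},
        2: {},                          # empty /usr directory, used as a mount point
        3: {"fstab": 4},
        4: "contents of /etc/fstab",
    },
    "usrfs": {
        1: {"bin": 2},
        2: {"ls": 3},
        3: "contents of /usr/bin/ls",
    },
}

# Mount table: (volume, inode number of the mount point) -> mounted volume.
MOUNT_TABLE = {("rootfs", 2): "usrfs"}


def resolve(path):
    """Decode path components one at a time, switching volumes at mount points."""
    volume, inode = "rootfs", ROOT_INODE
    for name in [part for part in path.split("/") if part]:
        directory = VOLUMES[volume][inode]
        if not isinstance(directory, dict):
            raise NotADirectoryError(name)
        if name not in directory:
            raise FileNotFoundError(path)
        inode = directory[name]
        # If this inode was used as a mount point, ignore its contents and
        # continue from the root inode of the mounted volume instead.
        mounted = MOUNT_TABLE.get((volume, inode))
        if mounted is not None:
            volume, inode = mounted, ROOT_INODE
    return volume, inode, VOLUMES[volume][inode]


if __name__ == "__main__":
    print(resolve("/usr/bin/ls"))
    print(resolve("/etc/fstab"))
```

    Resolving /usr/bin/ls crosses from the hypothetical rootfs volume into usrfs at the /usr component, which is why the caller never needs to know that two volumes were involved.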

  6. Describe the arguments to the mount command and explain the role of the /etc/fstab file. Also, explain why some mount commands without arguments seem to work.

    The Linux command mount -t fstype device mount-directory takes at least three arguments. -t fstype refers to one of the 12 possible file managers, device refers to the device driver that can access specific storage devices such as /dev/fd0, and mount-directory refers to an existing directory in the file hierarchy. Note that any files contained in the directory will be hidden by the mount operation.

    The /etc/fstab configuration file contains a list of file systems and arguments. The mount, umount, fsck commands read this file to discover details about the available file systems.

    file system  mount point  type    options                     dump  pass
    -----------  -----------  ------- --------------------------  ----  ----
    /dev/sda1    /            ext2    defaults,errors=remount-ro  0     1
    /dev/sda2    none         swap    sw                          0     0
    proc         /proc        proc    defaults                    0     0
    /dev/sda3    /usr         ext2    defaults                    0     2
    /dev/fd0     /floppy      msdos   noauto,conv=auto,user       0     0
    /dev/cdrom   /cdrom       iso9660 noauto,ro,user              0     0

    If the mount command is given with just one argument and it matches one of the rows in fstab, the other fields in the row are taken as arguments for the command. Given the above fstab file and the command mount /cdrom, the resulting command would be mount -t iso9660 -o ro /dev/cdrom /cdrom. The "noauto" option keeps mount from automatically mounting the floppy or CD at boot time, while the "user" option allows a non-superuser account to mount the floppy or CD.
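
    The lookup can be sketched in a few lines. This is an illustration only, not how the mount utility is implemented: the function name expand_mount is hypothetical, and the set of options that the sketch treats as handled by mount itself (rather than passed on with -o) is an assumption made to reproduce the /cdrom example, where the type field of the matching row is iso9660.

```python
FSTAB = """\
/dev/sda1    /            ext2    defaults,errors=remount-ro  0     1
/dev/sda2    none         swap    sw                          0     0
proc         /proc        proc    defaults                    0     0
/dev/sda3    /usr         ext2    defaults                    0     2
/dev/fd0     /floppy      msdos   noauto,conv=auto,user       0     0
/dev/cdrom   /cdrom       iso9660 noauto,ro,user              0     0
"""

# Assumption for this sketch: these options steer mount itself and are
# not passed along to the file manager.
MOUNT_ONLY_OPTIONS = {"noauto", "user", "defaults"}


def expand_mount(argument, fstab_text=FSTAB):
    """Return the full mount command for a one-argument mount, or None."""
    for line in fstab_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[0].startswith("#"):
            continue  # skip blank, short, and comment lines
        device, mount_point, fstype, options = fields[:4]
        if argument in (device, mount_point):
            kept = [o for o in options.split(",") if o not in MOUNT_ONLY_OPTIONS]
            command = ["mount", "-t", fstype]
            if kept:
                command += ["-o", ",".join(kept)]
            return " ".join(command + [device, mount_point])
    return None


if __name__ == "__main__":
    print(expand_mount("/cdrom"))
    print(expand_mount("/dev/fd0"))
```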

  7. Explain where the file managers are located and how they are activated.

    File managers are conditionally compiled into the kernel image. To see which managers a given kernel contains, examine the /proc/filesystems file. Its contents might look like this:

    nodev   proc
    nodev   nfs

    indicating that this kernel supports 6 of the 12 file managers.

    A file manager is activated when its file system is entered as part of a path traversal to reach a desired file. Linux usually begins with its native ext2 file manager and switches as it enters the mounted volume.
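
    The /proc/filesystems listing is easy to read from a program. The sketch below assumes the two-column layout shown above, where a leading "nodev" marks a file manager that does not need a block device; the function names are hypothetical.

```python
def parse_filesystems(text):
    """Return (name, needs_device) pairs from /proc/filesystems-style text."""
    result = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "nodev" and len(fields) == 2:
            result.append((fields[1], False))   # e.g. proc, nfs
        else:
            result.append((fields[0], True))    # backed by a block device
    return result


def supported_on_this_machine(path="/proc/filesystems"):
    """Parse the running kernel's list (requires a mounted /proc)."""
    with open(path) as handle:
        return parse_filesystems(handle.read())


if __name__ == "__main__":
    sample = "nodev   proc\nnodev   nfs\n"
    print(parse_filesystems(sample))
```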

  8. Explain how the MS-DOS file manager maps incompatibilities between MS-DOS and Linux files.

    Linux (Unix) ASCII files use one character (the LF or \n) to indicate end-of-line. MS-DOS employs two characters (the CRLF or \r\n pair) to show end-of-line. Linux (Unix) employs multi-group permissions and other file attributes while MS-DOS has just a few attributes (hidden, archive, read only).

    The MS-DOS file manager, therefore, must map a Linux (Unix) \n to \r\n as the file is written onto the MS-DOS volume. Coming from an MS-DOS volume, the file manager must give additional permissions (rwxrwxr-x root root) and map \r\n to \n. File manager mappings are controlled by additional arguments added to the mount command. For example, mount -t msdos -o conv=auto /dev/fd0 /floppy tells the file manager not to do the above mapping if the MS-DOS file has a common binary extension such as "exe."
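
    The end-of-line mapping can be sketched as two small functions. This is a toy model of the idea, not the driver's code: the function names are hypothetical, and the list of binary extensions is an assumption (the real list may differ). The conv values mirror the binary, text, and auto settings of the mount option.

```python
import os

# Hypothetical list of extensions treated as binary under conv=auto.
BINARY_EXTENSIONS = {".exe", ".com", ".bin", ".zip", ".gif"}


def skip_conversion(name, conv):
    """Decide whether a file should pass through untranslated."""
    if conv == "binary":
        return True
    if conv == "auto":
        return os.path.splitext(name)[1].lower() in BINARY_EXTENSIONS
    return False  # conv == "text": always translate


def to_linux(name, data, conv="auto"):
    """Bytes read from the MS-DOS volume: CRLF becomes LF."""
    if skip_conversion(name, conv):
        return data
    return data.replace(b"\r\n", b"\n")


def to_msdos(name, data, conv="auto"):
    """Bytes written to the MS-DOS volume: LF becomes CRLF."""
    if skip_conversion(name, conv):
        return data
    # Normalize first so an existing CRLF is not turned into CR CR LF.
    return data.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")


if __name__ == "__main__":
    print(to_msdos("notes.txt", b"one\ntwo\n"))
    print(to_linux("prog.exe", b"\r\n\x00"))
```

    The extension test matters because a blind translation would corrupt any binary file that happens to contain the byte pair 0x0D 0x0A.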

  9. Explain the conflicts that removable media introduce into a multi-user, data cached operating system and possible solutions.

    Use of the mount and umount commands indicates that the file manager maintains "state" information about a volume. Users allocate entries in kernel data structures when they "open" a given file. If a mounted CD-ROM or floppy is removed, then the file manager is unable to complete its accounting information for users accessing the volume. Furthermore, files written to the volume may still be in the data cache and not yet written to the floppy.

    The solution is to prevent a user from removing the media. The eject button on the CD-ROM drive, for example, is disabled while the volume is mounted. In the case of the floppy disk, there is no solution other than to restrict floppy disk access to the console terminal. At this time, Linux allows users access to any mounted volume.

  10. When using the umount command, explain what the error message "/dev/xxx busy" really means.

    The "state" information described above includes a "count" of the number of processes accessing a given inode. If one attempts to unmount an inode with a non-zero reference count, then the inode is "busy" since the other processes wish to read/write to or through the mounted inode.

    The "busy" count could also refer to the person issuing the umount command if their current working directory is on or below the mounted directory.
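
    The reference counting can be pictured with a toy model. The class below is purely illustrative: the name ToyVolume, its methods, and the message text are hypothetical and stand in for the kernel's bookkeeping.

```python
class ToyVolume:
    """Toy model of the per-volume reference count described above."""

    def __init__(self, device):
        self.device = device
        self.users = []  # one entry per open file or working directory

    def acquire(self, who):
        """A process opens a file on the volume or changes directory into it."""
        self.users.append(who)

    def release(self, who):
        """The process closes the file or changes directory elsewhere."""
        self.users.remove(who)

    def umount(self):
        """Refuse to unmount while any reference remains."""
        if self.users:
            return f"{self.device} busy"
        return "unmounted"


if __name__ == "__main__":
    floppy = ToyVolume("/dev/fd0")
    floppy.acquire("shell with cwd /floppy")
    print(floppy.umount())
    floppy.release("shell with cwd /floppy")
    print(floppy.umount())
```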

  11. Explain why one would receive the error message "mount: /dev/cdrom is not a valid block device."

    The mount utility attempts minimal error checking before it really mounts the new volume. Once the first block is read in, its contents are compared to other superblocks to double check that it is a valid file system for that type of file manager.

  12. Explain how the fdformat and mkfs commands differ.

  13. Describe the role of fsck and describe the three basic ways a block could be missing or duplicated in a fully-indexed file system.

    1. Free list block numbers, file block numbers, and inode block numbers may be double referenced in one list, referenced in more than one list, or have no reference.
    2. An allocated inode may not have a directory entry (orphaned file) or a directory entry may reference a free inode.
    3. Size of a file may differ from its block count.
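
    The block accounting in item 1 amounts to counting how many times each block number is referenced. The sketch below is a simplified illustration, not fsck's algorithm: the function name and the in-memory lists are hypothetical, and it reports a block as duplicated whether the extra reference is in the same list or in a different one.

```python
from collections import Counter


def check_blocks(total_blocks, free_list, file_blocks):
    """Return (missing, duplicated) block numbers.

    free_list   -- block numbers on the free list
    file_blocks -- mapping of file name to the block numbers it claims
    """
    counts = Counter(free_list)
    for blocks in file_blocks.values():
        counts.update(blocks)
    # In a fully indexed file system every block is referenced exactly once.
    missing = [b for b in range(total_blocks) if counts[b] == 0]
    duplicated = [b for b in range(total_blocks) if counts[b] > 1]
    return missing, duplicated


if __name__ == "__main__":
    print(check_blocks(6, [0, 1], {"a": [2, 3], "b": [3, 5]}))
```

    In the usage example, block 4 has no reference and block 3 is claimed by two files, so a repair tool would have to return block 4 to the free list and decide which file keeps block 3.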

  14. Describe the different types of pages and how they relate to swapping.

    Linux divides up memory into process, shared process, shared library, and data cache areas. After process and library memory has been allocated, the data cache will grow to fill most of memory. As new processes are activated, memory space will be reclaimed from the data cache. After a point, the data cache will no longer give up memory and idle pages are moved to the swapping disk to make room for new processes. On the Intel architecture, a "page" is 4096 bytes. The read-only shared pages have high priority and tend to stay in memory, but when their space is needed, they are overwritten to improve swapping speed.

  15. Describe the results of the free command.

                 total       used       free     shared    buffers     cached
    Mem:         14092      12288       1804      17456       1100       5588
    Swap:        40156         24      40132

    Physical memory is 16 MB and about 2 MB is used by the kernel. Thus the "total" remaining is 14 MB, and about 12 MB has been "used" or allocated to user programs and 2 MB are "free" or held in reserve for new processes.

    Of the "used" 12 MB, 1 MB has been allocated to "buffers" or permanent data cache while another 5.5 MB of memory has been "cached" or taken over by the data cache mechanism. The remaining 3.7 MB or so (used - (free + buffers + cached) = 3796 KB) holds about 17 MB of "shared" code. Thus, processes share about 3.7 MB of physical memory while executing as if they had 17 MB of memory.

    Even though there is 40 MB of swap space, only 24 KB of it have been used.
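
    The arithmetic above can be checked by parsing the listing. This is a sketch under stated assumptions: the column layout is the one shown above (newer versions of free print different columns), physical memory is taken to be 16 MB, and the "remaining" figure reads the guide's formula as used - (free + buffers + cached). The function names are hypothetical.

```python
FREE_OUTPUT = """\
             total       used       free     shared    buffers     cached
Mem:         14092      12288       1804      17456       1100       5588
Swap:        40156         24      40132
"""


def parse_free(text):
    """Turn the listing into {"Mem": {...}, "Swap": {...}} with values in KB."""
    lines = text.splitlines()
    headers = lines[0].split()
    table = {}
    for line in lines[1:]:
        label, *values = line.split()
        # The Swap row has only three values; zip stops at the shorter list.
        table[label.rstrip(":")] = dict(zip(headers, map(int, values)))
    return table


def summarize(table, physical_kb=16 * 1024):
    """Derive the figures discussed in the text (all in KB)."""
    mem = table["Mem"]
    return {
        "kernel_kb": physical_kb - mem["total"],
        "data_cache_kb": mem["buffers"] + mem["cached"],
        "remaining_kb": mem["used"] - (mem["free"] + mem["buffers"] + mem["cached"]),
        "swap_used_kb": table["Swap"]["used"],
    }


if __name__ == "__main__":
    print(summarize(parse_free(FREE_OUTPUT)))
```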

  16. Explain the following setup commands:

    # dd if=/dev/zero of=/swapfile bs=1024 count=8192
    # mkswap /swapfile 8192
    # sync
    # swapon /swapfile

    /dev/zero is a system abstraction that returns a continuous stream of null bytes. The device dump (dd) program reads 8 MB of zeros (8192 blocks of 1024 bytes each) and writes a contiguous (and hopefully sequential) file. mkswap initializes a free list of swap blocks within the file. sync moves file blocks left in the data cache out to the hard disk. And swapon directs the system swapper to begin using the file.
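
    The size produced by the dd step follows from bs and count: 1024 bytes times 8192 blocks is 8 MB. The sketch below reproduces only that step, writing to a temporary directory instead of /swapfile so it can be run without root privileges; it does not perform the mkswap or swapon steps, and the function names are hypothetical.

```python
import os
import tempfile


def make_zero_file(path, block_size=1024, count=8192):
    """Write count blocks of block_size zero bytes, as dd if=/dev/zero would."""
    block = bytes(block_size)  # block_size zero bytes
    with open(path, "wb") as out:
        for _ in range(count):
            out.write(block)
    return os.path.getsize(path)


def demo():
    """Create the file in a throwaway directory and report its size in bytes."""
    with tempfile.TemporaryDirectory() as directory:
        return make_zero_file(os.path.join(directory, "swapfile"))


if __name__ == "__main__":
    print(demo() // (1024 * 1024), "MB")
```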

  17. Contrast "special," "block special," and "character special" devices.

Copyright © 1998 P. Tobin Maginnis