Routine System Administration

Background reference:
Matt Welsh & Lar Kaufman, Running Linux,
O'Reilly & Associates, 2nd Ed., 1996.
Chapter 8

Copyright © 1998-2002 P. Tobin Maginnis
This document is free; you can redistribute it and/or modify it under the terms
of the GNU General Public License as published by the Free Software Foundation.

  1. Define "backup" and give the design tradeoff.

    Backup is more a "process" than a "program." It's the idea that if valuable information is placed on redundant media and the primary medium fails, the information can be copied back to a new instance of the primary medium.

    Unfortunately, the backup process often degenerates into a romantic ideal that is never realized (i.e., everyone assumes there is a valid backup). One reason is that backup requires additional equipment that is not critical to day-to-day operations, so the equipment may have been borrowed to support some other operation, or the backup device may have been used in "write" mode but never tested in "read" mode. Another reason is that backup requires a significant investment in disciplined labor: someone must perform the backup at the proper intervals, without skipping steps and without overwriting other backup copies in the library.

    The backup device usually takes the form of a tape drive that trades high capacity for very slow sequential access.

    The essence of the backup problem is that even when the backup procedure is done properly, there is no benefit unless the backup copies are restored and tested. The catch is that if the backup is tested and fails, the result is the same as a catastrophic hardware failure without backup. Finally, if the backup is encoded (archived and compressed), then the data is inaccessible unless the decoder (uncompressor and unarchiver) can be loaded on the new medium independently of the backup process. In other words, the only copy of the backup program(s) cannot be stored inside the backed-up data.

  2. Define "RAID" and explain how it interacts with the backup concept.

    RAID means Redundant Array of Inexpensive Disks and, as the name suggests, it combines hard disks in novel ways to maximize access speed or reliability. One form of reliability is RAID-1, which automatically duplicates information on two or more identical hard disks. If one disk fails, an alarm indicates that it should be replaced. After replacement, the contents of the surviving disk are copied to the new disk. If backup is simply viewed as a means to ensure reliable hardware storage, then RAID may offer a viable solution and pay for itself in avoided labor costs within a year.

    On the other hand, many professionals believe that keeping the system administrator's job depends upon the ability to restore a system that has been corrupted as a result of hardware or human error, and RAID faithfully duplicates a human error onto every disk. Therefore, the only solution for this group is to employ strict and systematic operator procedures to implement the backup process.

  3. Define "incremental backup," explain how to restore from a catastrophic hard disk failure, and explain why one may not be able to restore from an incremental backup.

    Incremental backup is a process that trades decreased backup time for increased restore time. The process begins with a full or "epoch" backup. In each subsequent incremental backup, only the files that were created or modified since the previous backup are saved. These short backups can only go on for so long before they use too many backup tapes (usually seven). Then the process begins again by rewriting the epoch tape and reusing the incremental tapes.

    In the event of failure, the epoch tape is restored and each incremental backup is applied from the oldest to the newest tape (but not beyond the failure date). Since multiple tapes are needed for a restore, there is a (high) possibility of operator error: files can be spread over all the tapes in use since the last epoch backup, several tapes may have to be searched to find the correct archive file to restore, and the archives must be restored in their order of creation. A restore may be impossible altogether if an intermediate archive has been overwritten.
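
    The epoch-plus-incremental cycle described above can be sketched with ordinary archive files standing in for tapes. This is only an illustration under stated assumptions: the temporary directory layout, the file names, and the "stamp" file used to remember the time of the previous backup are all hypothetical, and a tar that accepts -T is assumed.

```shell
#!/bin/sh
# Sketch: one epoch backup, two incremental backups, then a restore.
# Archive files in $work/tapes stand in for tapes (hypothetical layout).
set -e
work=$(mktemp -d)
mkdir "$work/data" "$work/tapes" "$work/restore"
cd "$work/data"

echo "v1" > a.txt
echo "v1" > b.txt
tar -cf "$work/tapes/0-epoch.tar" .      # full (epoch) backup
touch "$work/stamp"                      # remember when the last backup ran

sleep 1
echo "v2" > a.txt                        # only a.txt changes
find . -type f -newer "$work/stamp" -print > "$work/list1"
tar -cf "$work/tapes/1-incr.tar" -T "$work/list1"
touch "$work/stamp"                      # the next incremental starts from here

sleep 1
echo "v1" > c.txt                        # a new file appears
find . -type f -newer "$work/stamp" -print > "$work/list2"
tar -cf "$work/tapes/2-incr.tar" -T "$work/list2"

# Restore: the epoch first, then every incremental from oldest to newest.
cd "$work/restore"
for t in "$work"/tapes/*.tar; do
    tar -xf "$t"
done
ls -l "$work/restore"
```

    Note that every archive in the series is needed; if 1-incr.tar were overwritten, the newer a.txt could not be recovered.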

  4. Define "differential backup," explain how to restore from a backup.

    Differential backup also trades decreased backup time for increased restore time, but it is a compromise between epoch and incremental backup procedures. As above, the differential backup begins with a full or "epoch" backup. Thereafter, all files that were created or modified since the last epoch backup are saved with each differential backup. Thus, as the week draws to a close, the differential backup grows larger and larger (but is still small compared to the epoch backup).

    In the event of failure, only two tapes (steps) are required to restore the system in the differential backup: the epoch tape and the most recent differential tape. To simplify the process even more, some system administrators use just one tape and initialize it with an epoch. Subsequent differential backups are placed at the end of the same tape until that tape fills and the process is restarted.
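
    A differential cycle can be sketched the same way. The one difference from an incremental cycle is that the reference time is never advanced after the epoch, so each differential archive contains everything changed since the epoch. The directory layout and file names below are hypothetical, and a tar that accepts -T is assumed.

```shell
#!/bin/sh
# Sketch: an epoch backup followed by two differential backups.
set -e
work=$(mktemp -d)
mkdir "$work/data" "$work/restore"
cd "$work/data"

echo "v1" > a.txt
tar -cf "$work/epoch.tar" .              # full (epoch) backup
touch "$work/epoch.stamp"                # fixed until the next epoch

sleep 1
echo "v2" > a.txt
find . -type f -newer "$work/epoch.stamp" -print > "$work/diff1.list"
tar -cf "$work/diff1.tar" -T "$work/diff1.list"

sleep 1
echo "v1" > b.txt
# Still compared against the epoch stamp, so a.txt is saved again.
find . -type f -newer "$work/epoch.stamp" -print | sort > "$work/diff2.list"
tar -cf "$work/diff2.tar" -T "$work/diff2.list"

# Restore needs only two archives: the epoch and the newest differential.
cd "$work/restore"
tar -xf "$work/epoch.tar"
tar -xf "$work/diff2.tar"
ls -l "$work/restore"
```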

  5. Contrast "epoch," "incremental," and "differential" backups and describe file "ghosting."

    Backup procedures are designed to save and restore files and, in general, they do not detect deleted files. This leads to problems when, after a multi-file restore, files that had been deleted reappear as garbage-file "ghosts."

  6. Support or refute the concept of a minimal system backup.

    Support: There are various forms of backup. At one extreme, complete disk images are saved and restored. At the other extreme, only a subset of one user's files is backed up. A minimal system backup is somewhere in the middle, where just enough data is saved so that a new system could be rebuilt. There are only a few key areas where user and system configuration files are routinely stored. It is possible to save that information, install a new version of the same Linux distribution, and reconfigure the new version by copying over the old configuration files.

    Refute: There is only one reason for backup and that is the restoration of any file lost through machine failure or human error. Therefore, a minimal system backup is an epoch backup of all drives. Eliminating files based on location invites a host of problems. First, no one can predict where a critical file may be stored and subsequently lost. Second, there may well be location and file format conflicts with the new version (distribution) of Linux. For example, the X Window configuration files may move or a different shared C run-time library may be required. Third, rebuilding a system from a new distribution also means that individually installed packages must be re-installed and, if any customization had been done to those packages, that information is lost.

  7. Assuming one is willing to risk incompatibilities and does not have customized packages, describe a minimal system backup procedure that could be exported to a new version of Linux.

    A minimal system backup would be global system configuration files and the user areas:

  8. Contrast a physical record versus logical block for devices and relate this to how tape drives are used.

    Generally, a device low-level formats physical records on its medium. In this way, the device hardware can detect if and when the proper physical location is present before user data is read or written. A logical block is an operating system concept that includes a uniform set of commands for all devices and a common buffer size and format for all devices. For efficiency, logical blocks usually contain multiple sequential physical records.

    Tape drives make no distinction between physical records and logical blocks. A tape drive reads physical records, but the size of the record is based upon how many bytes in the logical block were written when the record was created. In other words, if the tape drive is asked to read more or less data than physically exists, then an incomplete physical record is discovered and an I/O error declared, even though there is valid data on the tape.

    Tape drives are simple sequential devices and have no built-in concept of random file access. To give the illusion of random access, two inter-record gaps in sequence are said to represent an end-of-file marker. The tape operator is responsible for counting "gaps" to reach the appropriate file.

    Finally, multi-file tapes are essentially Write Once, Read Many (WORM) devices since an "erased" file would have to be replaced with an identical size file to keep the end-of-file gap valid.

  9. Contrast the following individual tar commands (no group operation is implied):
    tar cvf /dev/nrft0 /etc /home /var/spool/mail
    mt -f /dev/nrft0 rewind
    mt -f /dev/nrft0 fsf 2
    tar tvf /dev/nrft0

    Accessing the /dev/rft0 tape device causes the driver to rewind the tape upon completion of the read or write operation. In this way, the user always knows that the tape is at its beginning as the next command is typed. But accessing the /dev/nrft0 tape device will not rewind the tape upon completion of the I/O operation, making the user responsible for keeping track of the current tape position.

    Thus, the first command creates an archive of three directories beginning at the current tape position. The tape is not rewound, so subsequent archives could be written following the current archive. The second command rewinds the tape back to the beginning. The third command advances the tape two archive files, positioning the R/W head before the third archive file. The last command reads the table of contents of the third archive file.

  10. Support or refute the concept of compressing an archive and offer a compromise compression technique.

    Refute: compression reduces the multi-file archive into one "blob." One bad physical record would prevent the whole archive from being uncompressed and all files would be lost. If the same archive had been left uncompressed, the same bad record would only prevent one of the many files from being extracted.

    Support: compression can shrink an ASCII text archive 5 to 1. To overcome the bad record problem, generate the compressed archive in two places. Two copies at one-fifth size occupy 40% of the original space, so this still saves 60% over the initial archive and provides added backup reliability. Also, if the bad record occurred in the uncompressed archive's table of contents area, then a significant portion (if not all) of the archive would be lost regardless.

    Compromise: Use a backup system which compresses individual input files, rather than the single output stream. The risk of loss due to a bad physical record at this point is only the file associated with the bad record.
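
    The compromise can be sketched with standard tools: each input file is compressed on its own and the compressed members are then stored in an uncompressed archive. This is only an illustration; the staging directory and file names are hypothetical, and purpose-built backup suites handle this internally.

```shell
#!/bin/sh
# Sketch: compress individual files, then archive them without compressing the stream.
set -e
work=$(mktemp -d)
mkdir "$work/src" "$work/stage" "$work/out"
printf 'alpha\n' > "$work/src/one.txt"
printf 'beta\n'  > "$work/src/two.txt"

# Each file becomes its own small compressed member.
for f in "$work"/src/*; do
    gzip -c "$f" > "$work/stage/$(basename "$f").gz"
done

# The archive itself is left uncompressed, so a bad record damages
# only the member that occupies it.
tar -cf "$work/backup.tar" -C "$work/stage" .

# Restore: extract the members, then uncompress the one that is needed.
tar -xf "$work/backup.tar" -C "$work/out"
restored=$(gzip -dc "$work/out/two.txt.gz")
echo "$restored"
```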

  11. Explain how the following commands perform an incremental backup and describe what is missing.
    cd /
    find . -mtime -1 \! -type d -print > /backup/daily
    find . -ctime -1 \! -type d -print >> /backup/daily
    tar -cv -T /backup/daily -f /dev/rft0

    The command "find / -print" generates absolute path names (i.e., /dir1/dir2/filex), and this type of backup usually presents problems if the restored files are to be placed in a top-level directory different from where the files were saved. To limit this problem, the first command "cd /" combined with "find ." produces relative paths (i.e., ./dir1/dir2/filex) that may be restored in any top-level directory.

    The second command "find" begins at the system root directory and searches for any file that was modified in the last 24 hours. The backslash "\" tells the shell to ignore the "!" operator and to pass the argument as-is onto the find program where the "! -type d" string tells find not to include directory files in the report. The results are redirected to a file called "daily."

    The third command (second find command) also begins at the system root directory and searches for any file whose status (inode) changed in the last 24 hours, which catches newly created files as well as files whose owner or permissions changed, and appends (>>) its search results to the file called "daily."

    The fourth command employs "tar" to create the archive by reading file names to be archived from the "daily" file and saving the result on the "raw floppy tape" drive.

    Several things are missing in this example. First, a "-prune" expression (e.g., "-path /dir -prune") would also be required in a production backup script to skip unwanted areas, since there are many other system files accessed daily such as /etc, /proc, /dev, and /var/log files. Second, directory owners and permissions are not saved. Third, most tar programs cannot back up a pathname which is longer than 126 component names.
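
    A pruned version of the file-list step might look like the following sketch. It runs against a scratch directory tree rather than the real root so that it can be tried safely; the tree, the pruned directories, and the list and archive file names are all hypothetical. The two find passes are also merged into one so that a file matching both tests is listed once.

```shell
#!/bin/sh
# Sketch: build the daily file list with unwanted directories pruned.
set -e
root=$(mktemp -d)        # stands in for "/"
list=$(mktemp)           # stands in for /backup/daily
archive=$(mktemp)        # stands in for the tape device

mkdir -p "$root/home/user" "$root/proc" "$root/var/log"
echo data  > "$root/home/user/notes.txt"
echo noise > "$root/proc/fake"
echo noise > "$root/var/log/messages"

cd "$root"
# Skip ./proc and ./var/log entirely; elsewhere, list non-directory files
# that were modified or had a status change in the last 24 hours.
find . \( -path ./proc -o -path ./var/log \) -prune -o \
     \( -mtime -1 -o -ctime -1 \) ! -type d -print > "$list"

tar -cf "$archive" -T "$list"
tar -tf "$archive"
```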

  12. Explain the role of "cron" and describe how to have the backup command executed on a regular basis.

    The term "cron" refers to a Unix daemon that checks a series of tables (files) to see if it's time to run a job. Tables in /etc/crontab are system-wide, while tables in /var/spool/cron/useraccount are specific to that user. To automate the backup process, edit the cron table with the "crontab -e" command. The table has six columns: minute, hour, day-of-month, month, and day-of-week, followed by the command to be executed at the specified time. Note that the command may reference an executable shell script in your home directory. Entries in each column specify a point in time based upon the column interval, such as once at a given minute, once an hour, once on a given day of the week, etc.

    Please note that a command runs only when the minute, hour, and month columns all match the current time and the day also matches. Multiple entries within one column (e.g., "1,15") are alternatives, so that column matches when any one of them matches. The two day columns are special: if both day-of-month and day-of-week are restricted (neither is "*"), they are combined as a Boolean OR and the command runs when either one matches; if one of them is "*", only the other restricts the day. Finally, the special character "*" in a column matches every value of that column (i.e., each minute of each hour of each day of the week), so in practice it acts as a "do-not-care condition" and the specific values in the other columns determine when the command runs.
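
    As a small illustration of the six-column layout, the sketch below splits one table row into its columns. The row and the script path /home/ptm/backup.sh are hypothetical; the row would run that script at 2:30 AM every Sunday.

```shell
#!/bin/sh
# Hypothetical entry: run a backup script at 2:30 AM every Sunday (day-of-week 0).
entry='30 2 * * 0 /home/ptm/backup.sh'

# Split the row into its five time columns and the command column.
read -r min hr dom mon dow cmd <<EOF
$entry
EOF

echo "minute=$min hour=$hr day-of-month=$dom month=$mon day-of-week=$dow"
echo "command=$cmd"

# To install the row, it could be appended to the current table
# (left commented out so the sketch changes nothing):
# ( crontab -l 2>/dev/null; echo "$entry" ) | crontab -
```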

  13. Describe each of the following cron table entries:
    min    hr    dom    mnth  dow command
    0     0001    1      *     *  find /tmp/ \! -type d -atime -3 -exec ls -lt {} \;
    015   11      *   jan,jun mon mt -f /dev/rft0 rewind; tar -cf /dev/rft0 /src
    55    23     1,15 3,6,9,12 *  last | grep -v ptm >> /home/ptm/log
    30     2     */5    */3   0-6 sh /home/ptm/runcron

    Numbers are "free-field" (delimited with non-numeric ASCII characters). Days and months can be numeric or three-letter abbreviations. The numeric range for months is 1-12, but days of the week range 0-6 (0 is Sunday). The "," enumerates multiple days or months and the "/" operator specifies a step, so the command runs on every nth day or month.

    Row 1, at 1:00 AM on the 1st day of the month, run a command that locates any files below /tmp that have been accessed within the last three days. For each such file, execute the ls program to display the "-l" (long format) information for that file. The "t" within the "-lt" argument has no effect since ls is executed once for each file found and there is no multi-file name list to sort by time.

    Row 2, at 11:15 AM every Monday in January and June, rewind the tape drive in case it was not rewound, and archive the sources onto the "floppy tape" drive.

    Here we see that the "*" in the day-of-month column means "do not care," so the day-of-week argument (Monday) alone selects the day, and the month list limits the command to January and June. Also note that the use of month and day names is referred to as the Paul Vixie extension to cron (or the vixie-cron) and, at the present time, names (as well as name lists or ranges) may not be implemented in all cron versions.

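    Row 3, at 11:55 PM on the 1st and 15th of March, June, September, and December, append the login records of every account except "ptm" to the log file /home/ptm/log.

    Row 4, at 2:30 AM on every fifth day of the month (the 1st, 6th, 11th, and so on) in every third month, run the shell script /home/ptm/runcron. In vixie-cron the step "*/3" counts from the start of the 1-12 month range and therefore selects Jan, Apr, Jul, and Oct; the day-of-week range 0-6 matches every day and so adds no restriction here.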

  14. Explain how daemons are "different" from other programs and how to accommodate these differences.

    Regardless of how the daemon is executed (from the console, network terminal, or from the super server inetd) it cannot be interrupted by typical control signals such as typing ^C or logging off. Moreover, the daemon cannot report error or status information to a "default" terminal.

    By convention, a daemon can be told to re-read its configuration file(s) and restart itself with the "HUP" signal. For example, if the system logging server (syslogd) PID was 2363, the command kill -HUP 2363 tells the log server to close any open log files, re-read the /etc/syslog.conf file, and restart using the new parameters. Also by convention, a daemon can be terminated with the "TERM" signal. For example, if the Web server (httpd) PID was 1234, the command kill -TERM 1234 tells the Web server to terminate its children, close all files, and exit.

    Error reports are sent to one or more log files instead of to a "default" terminal screen. Again by convention, these files are held in the directory /var/log.

  15. Describe syslog daemon operation.

    The syslog daemon waits for network activity from the Unix domain socket /dev/log or the UDP socket on port 514 (the syslog network service). When a packet arrives, syslogd reads the message and, based upon the message type and dispersal instructions in the /etc/syslog.conf file, syslogd writes the message to the console screen or to one or more log files. Syslog message types consist of an error level (priority) and facility (security, cron, mail, etc.), the name of the program issuing the message, and the message. The date, time, and host name are also placed at the front of the message by the system logging daemon.

    Although the log files can be viewed and edited, syslogd remains in the shadows since the typical user has no need to place information directly into the system log file. There is, however, a command called logger that submits a message to the system log from the shell. To write a message from a program (or a daemon), the library services "openlog()" and "syslog()" are used.

  16. Describe the contents of the following syslog.conf file:
    *.info;*.notice  /var/log/message
    mail.debug       /var/log/maillog
    *.warn           /var/log/syslog
    kern.emerg       /dev/console

    The "dot" separates the facility on the left side from the error level on the right side. An asterisk represents any facility or level depending on which side it is placed. The semicolon (;) delimits multiple message types to be directed to the same file. On the far right hand side the name of the destination log file is specified.

  17. Explain the command kill -HUP `cat /var/run/` and contrast it with the command killall -HUP syslogd.

    First, in spite of its name, "kill" only sends signals. Only the kill signal (number 9) can guarantee process termination and that only works if the process is not in a kernel service where signals are ignored. Second, Unix daemons store "run-time" data in the /var/run directory. Files ending in ".pid" are ASCII strings of the process' PID. Third, two backwards quote marks (`) can be used so that a shell can invoke another shell to run a program and return the ASCII string result of that program.

    Therefore, the kill utility is sending a reset signal to the syslog daemon and the PID of the daemon is discovered by having a child shell read the run-time file and returning the PID string to the shell running the kill utility.

    Linux has adapted killall from System V Unix so that it sends the specified signal to all processes that match the specified name. In the present example, kill and killall have the same effect for "single instance daemons" such as the system logging daemon but killall may have unexpected results for multi-process daemons such as the Web server daemon (httpd).
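
    The PID-file convention can be tried without touching a real daemon. In the sketch below a small background shell stands in for the daemon: it writes its PID to a file, treats HUP as "reload," and exits on TERM. The PID file and log file live in a scratch directory and are hypothetical stand-ins for files under /var/run and /var/log.

```shell
#!/bin/sh
work=$(mktemp -d)
pidfile="$work/demo.pid"    # hypothetical stand-in for a file under /var/run
log="$work/demo.log"

# Stand-in daemon: record the PID, "reload" on HUP, exit cleanly on TERM.
sh -c '
    trap "echo reload >> \"$1\"" HUP
    trap "echo exit >> \"$1\"; exit 0" TERM
    echo $$ > "$2"
    while :; do sleep 1; done
' demo "$log" "$pidfile" &

# Wait until the PID file has been written.
while [ ! -s "$pidfile" ]; do sleep 1; done

kill -HUP `cat "$pidfile"`     # the child shell reads the PID file for kill
sleep 2                        # the trap runs once the current sleep returns
kill -TERM `cat "$pidfile"`    # ask the stand-in daemon to exit
wait                           # reap the background process
cat "$log"
```

    Because the signal goes to the one PID recorded in the file, only that process is affected, whereas a name-based killall would reach every process with a matching name.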

  18. Define "log rotation" and describe the command logrotate.

    Log rotation is the idea of aging log files so that after they are no longer needed for system administration they can be deleted or e-mailed elsewhere for archiving. Active (large) log files may be rotated each day or every few days. Less active files are rotated once per month.

    With each rotation, the oldest file is removed or e-mailed. Log files are shuffled so that the next-to-oldest is moved from file.log.n to file.log.n+1. When the youngest log file is reached, it is usually compressed and moved to the open slot created by the earlier file shuffling.
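
    The shuffle can be sketched as a short shell function. This is not how logrotate itself is implemented; the function name rotate_log, the ".gz" suffix on rotated generations, and the scratch log file are assumptions made for illustration.

```shell
#!/bin/sh
# rotate_log FILE KEEP: keep at most KEEP compressed generations of FILE.
rotate_log() {
    log=$1
    keep=$2
    # Remove the oldest generation to open a slot.
    rm -f "$log.$keep.gz"
    # Shift the remaining generations up by one, oldest first.
    n=$((keep - 1))
    while [ "$n" -ge 1 ]; do
        if [ -f "$log.$n.gz" ]; then
            mv "$log.$n.gz" "$log.$((n + 1)).gz"
        fi
        n=$((n - 1))
    done
    # Compress the current log into slot 1 and start a fresh, empty log.
    if [ -f "$log" ]; then
        gzip -c "$log" > "$log.1.gz"
        : > "$log"
    fi
}

# Four "days" of logging with three generations kept.
work=$(mktemp -d)
for day in 1 2 3 4; do
    echo "day $day" > "$work/app.log"
    rotate_log "$work/app.log" 3
done
ls "$work"
```

    After the fourth rotation the first day's log has been dropped. A real daemon that keeps its log open would also need to be told to reopen the file, which is what the HUP signal is used for.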

    The Linux command logrotate is activated by the cron daemon (e.g., /etc/cron.daily/logrotate). Upon startup logrotate reads its configuration file (/etc/logrotate.conf) and performs rotation for all file types described. Upon completion logrotate writes its status information file to /var/lib/logrotate.status.

  19. Describe the log rotation configuration files and explain the following example:
    /var/log/httpd/access.log {
        rotate 5
             /sbin/kill -HUP `cat /var/run/`

    Since there are many types of log files, the /etc/logrotate.conf file generally contains the directive include /etc/logrotate.d that tells the logrotate command to process a directory of additional configuration files. The files in the logrotate.d directory describe how to process cron, Web, FTP, and system log files.

    In the above example, a series of directives are delimited with the open and close brace characters "{}" and associated with the log file /var/log/httpd/access.log. The log is rotated through five generations before the oldest is removed. Rotation frequency and use of compression are determined by global definitions found in the /etc/logrotate.conf file. The old log file and any error reports are to be e-mailed. If the current log file grows larger than 100 KB, then perform rotation without waiting for the usual rotation time. Finally, execute the kill command to reset the daemon after log file rotation.

  20. Explain the role of non-ASCII /var/log files such as wtmp, lastlog, and utmp.

    These binary data files are used by user accounting programs. The /var/log/wtmp file accumulates an entry for each login session (i.e., the entry is not complete until the user logs off the system). The /var/log/wtmp file is read by the last command, which lists login sessions from the most recent to the least recent along with the login time and duration of each. The /var/log/lastlog file records only the most recent login for each user. The commands who, w, and finger access the /var/log/utmp file to show the currently logged in users. The finger command also gets account information such as the home phone number and login shell from the /etc/passwd file, and it displays the contents of the .plan, .project, and .forward files from the user's home directory.

  21. Name and describe seven types of backup program suites.

  22. Define and give the design tradeoff of "RAID." Contrast disk interleave with RAID striping.

    RAID is an acronym for Redundant Array of Inexpensive Disks and it originates from a time when price breaks existed between large, fast 14 inch drives and small, slow 5 inch drives. RAID accepts the cost of multiple drives in exchange for fast simultaneous access and the possibility of redundant (reliable) information stored across the multiple drives (stable storage).

    Physical sectors are numbered sequentially around the hard disk platter. Interleave is the idea of creating logical sectors that skip one or more physical sectors, thereby giving the read/write circuitry enough time to reset so that sequential sectors may be read without having to wait for the beginning of the next sequential sector to spin around and travel under the read/write head again.

    RAID striping is the same idea as interleave, except that it occurs across hard disks. Striping writes sequential sectors (or cluster of sectors) to each hard disk drive. When the last drive is reached, the next sector(s) are written to the first hard disk drive and the process repeats.
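
    The round-robin placement can be written as simple arithmetic. In this sketch, which assumes one sector per stripe unit and an array of four disks, a logical sector number maps to a disk (the remainder) and to a row on that disk (the quotient). The function name stripe_location is hypothetical.

```shell
#!/bin/sh
# stripe_location SECTOR DISKS: print "disk row" for a logical sector.
stripe_location() {
    echo "$(( $1 % $2 )) $(( $1 / $2 ))"
}

disks=4
for sector in 0 1 2 3 4 5 6 7; do
    set -- $(stripe_location "$sector" "$disks")
    echo "logical sector $sector -> disk $1, row $2"
done
```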

  23. Define "RAID-0" and explain how it implements fast access, yet increases the probability of disk error.

    In type 0 RAID, multiple hard disks are synchronized so that all drives have the next sequential sector under the read/write heads at the same time. Assuming a five-disk array, five sequential records can be written or read in the same interval as one single-drive record.

    Eventually all disk drives will fail. The manufacturer assists with determining this eventuality by providing a Mean Time Between Failure (MTBF) estimate in hours. A typical value is 250,000 hours (or 28.5 years). Each drive represents one data point in the distribution of time-to-failure intervals and it is this distribution that is used to calculate the MTBF for all drives. As more drives for the RAID sub-system are randomly selected from the drive failure distribution, the more likely it is that a "short life" drive would be selected. Because RAID-0 stripes data with no redundancy, the failure of any one drive loses data for the whole array.
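
    A rough feel for the effect can be had with a little arithmetic. The sketch below assumes, as a simplification not stated in the text above, that drives fail independently at a constant rate, in which case the expected time to the first failure in an array of n drives is the single-drive MTBF divided by n. The five-drive array size is also just an example.

```shell
#!/bin/sh
# Sketch: expected time to first failure for an array, assuming
# independent drives with a constant failure rate (a simplification).
mtbf_hours=250000       # single-drive MTBF from the text
drives=5                # example array size
hours_per_year=8760

array_hours=$((mtbf_hours / drives))
single_tenths=$((mtbf_hours * 10 / hours_per_year))
array_tenths=$((array_hours * 10 / hours_per_year))

echo "single drive: $((single_tenths / 10)).$((single_tenths % 10)) years"
echo "$drives-drive array: $((array_tenths / 10)).$((array_tenths % 10)) years"
```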

  24. Define "RAID-1," explain how it implements stable storage and possibly fast access, and how it conflicts with cache.

    The RAID-1 design duplicates data on each drive in the array. Usually there are just two RAID-1 drives and if one fails, it is replaced and updated before the second drive has a chance to fail.

    If there are more than two drives and they are synchronized like RAID-0, then read (but not write) access speed increases as sequential records can be read from each drive. Write access must write the same record to all drives.

    Note that if reliability is desired, then caches located on the disk controller and on the disk drive must be set "write through" or disabled, since a power failure may leave the disk drives with inconsistent information.

  25. Define "RAID-3" and explain how it implements a much smaller (compressed) stable storage footprint.

    RAID-3 exploits the XOR Boolean operation that creates extra "information" bits that will reconstruct missing information. As an exercise, write any three binary bytes on a piece of paper. Create a fourth byte using the XOR of the first three bytes. Now cross off any one of the four bytes and XOR the remaining three bytes. The result will be the byte that was crossed off.
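
    The paper exercise can also be run in the shell. The three byte values below are arbitrary examples; the point is only that XOR-ing the surviving bytes with the parity byte yields the missing one.

```shell
#!/bin/sh
# Three arbitrary data bytes, one per "disk."
a=$((0xA5))
b=$((0x3C))
c=$((0x0F))

# The parity byte is the XOR of the data bytes.
parity=$((a ^ b ^ c))

# Pretend the disk holding b has failed: rebuild it from the others.
rebuilt=$((a ^ c ^ parity))

printf 'parity  = 0x%02X\n' "$parity"
printf 'rebuilt = 0x%02X (original b = 0x%02X)\n' "$rebuilt" "$b"
```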

    Regardless of the number of disk drives employed, the RAID-3 XOR information is held on an extra hard disk and when one of the drives fails, the replacement disk is reconstructed from the XOR of the other drives. In this way, fast access may be achieved through striping and reliability achieved through the XOR disk.

    Again, caches located on the disk controller and on the disk drive must be set "write through" or disabled, since a power failure may leave the disk drives with inconsistent information.

  26. Define "RAID-5" and explain how it is a compromise between slow compressed stable storage and fast access.

    Although RAID-3 allows fast read access, it suffers from a write access bottleneck. Specifically, the XOR information must be tabulated and always written to the same hard disk, while the data may be striped across the other drives.

    RAID-5 improves upon the RAID-3 design by incrementing the XOR start disk number (for each write operation) and striping the XOR information and data so that it is spread evenly among all disks in the array. Even though RAID-5 suffers from XOR tabulation overhead, it may read and write in parallel.

    Again, caches located on the disk controller and on the disk drive must be set "write through" or disabled, since a power failure may leave the disk drives with inconsistent information.

  27. Describe the overall printing system in Linux (BSD Unix).

    BSD Unix printing is implemented as a client/server model and predicated on network sockets even when the printing is done locally. The client is called line printer request or lpr and the server is referred to as the line printer daemon or lpd. The client lpr program reads the files to be printed from the users' area and writes them into a spooling directory named after the printer. A control file describing the files to be printed is also written into the directory by lpr. The client lpr program sends a "notification" message through the Unix domain socket to the lpd server and terminates.

    The lpd server takes over and reads its configuration file searching for valid printers and valid control files in the directories associated with the printers. If the printer is local, the server sends the optional header identification page and the data file to the printing device. If the printer is remote, the server establishes a network connection with its peer server on the remote host and sends the control and data files to the peer. The peer server now processes the files in the same way as the local lpd would have.

  28. Explain the operation of the following commands: lptest 80 120 > /dev/lp1 and cat > /dev/ttys0.

    Lptest generates a "ripple test pattern" of 80 characters by 120 rows. This is sent to the device attached to a parallel port. Assuming that an attached printer does \n to \r\n translation, this will print a pretty ripple test pattern.

    The command cat > /dev/ttys0 will copy the (presumably PostScript) contents of the file out through a serial port. Again, assuming that the port has been configured with the proper baud and data settings and that the attached printer can interpret PostScript, it will render the page described in the file.

  29. Explain the role of the /etc/printcap file and explain the meaning of these entries:
    lp|LA-180 DecWriter III:\
    lj4|HP LaserJet 4 Plus Printer:\

    The printcap file describes the set of printers "reachable" from the current host. Each line describes a printer capability. The line begins with a label for the printer. Subsequent fields are delimited with the ":" character and the "\" followed by a new line indicates that the next line is to be joined with the previous when the printcap file is read by lpd.

    The first line describes "lp" as an old dot matrix printer that runs at 1200 baud [br]. Flags [fs] (an octal bit pattern) are to be set in the serial I/O register of the printer. The printer needs a trailer [tr] string, a form-feed character, so that paper can be ripped off from the printer. Before printing, the input filter [if] program "lpf" is invoked to paginate the output. Finally, error messages are put into the file /var/log/lpd-errors.

    The second line describes "lj4" as a stand-alone printer directly connected to the network and the remote machine [rm] is "lj4" (this could be an IP address). The spool directory is also lj4 and error messages are to be placed in the file /var/log/lpd-errors.

  30. Define "filter," explain its role in Unix printing, and give examples.

    A filter reads from standard in, transforms input based upon an internal algorithm and places the result on standard out.

    Printers usually require that ASCII or graphic information be converted to a printer-specific format before they accept print data. The role of the Linux (Unix) printer filters is to lay out unformatted text and to convert standard OS, industry, or other printer formats into the printer-specific format.

    1. The /usr/lib/lpf filter is designed for ASCII text reformatting. For example, it converts the \n Unix EOL into the two character \r\n MS-DOS EOL for a printer.
    2. The /usr/bin/nenscript filter reads ASCII files and converts to PostScript (which does not seem like much since most PostScript printers accept ASCII text directly). But nenscript will also reformat the ASCII text with a smaller font and multiple columns that puts two pages to one physical landscape page.
    3. There are also "magic" filters that convert format based upon file content. For example, automatic detection of DVI format and subsequent conversion to PostScript.
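
    A filter in this sense can be very small. The sketch below is not the lpf program itself; it is a hypothetical one-line stand-in, named crlf_filter here, that reads standard input and converts the Unix \n end-of-line into \r\n on standard output.

```shell
#!/bin/sh
# crlf_filter: read text on standard input, write it with \r\n line endings.
crlf_filter() {
    awk '{ printf "%s\r\n", $0 }'
}

# Show the bytes that would reach the printer.
printf 'line one\nline two\n' | crlf_filter | od -c
```

    Because it only uses standard input and standard output, such a filter can be placed in a pipeline or named as an input filter without knowing where the data comes from or goes.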

  31. Describe how to set up a new printer.

    1. Edit the /etc/printcap file, copy an existing printer entry and give the new printer a "logical name."
    2. Modify the new printer printcap parameters to match the new printer.
    3. Add the proper printer input filter if required.
    4. Use the logical name to name a new printer directory under the /var/spool/lpd directory.
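
    The steps above might be scripted roughly as follows. This is only a sketch: the function name add_printer and the PRINTCAP and SPOOL_ROOT environment variables are hypothetical, and instead of copying an existing entry it writes a minimal one for a locally attached printer. Ownership and permissions of the spool directory are left to the administrator.

```shell
#!/bin/sh
# add_printer NAME DEVICE [FILTER]
# Sketch: append a minimal local-printer entry to the printcap file and
# create the matching spool directory. PRINTCAP and SPOOL_ROOT may be set
# in the environment to try this out away from the real system files.
add_printer() {
    name=$1
    device=$2
    filter=${3:-}
    printcap=${PRINTCAP:-/etc/printcap}
    spool_root=${SPOOL_ROOT:-/var/spool/lpd}

    # Refuse to add a second entry with the same logical name.
    if grep -q "^${name}[|:]" "$printcap" 2>/dev/null; then
        echo "add_printer: $name is already defined in $printcap" >&2
        return 1
    fi

    # The spool directory is named after the logical name.
    mkdir -p "$spool_root/$name" || return 1

    # lp = device, sd = spool directory, if = input filter (optional),
    # sh = suppress the banner page.
    {
        printf '%s:\\\n' "$name"
        printf '    :lp=%s:\\\n' "$device"
        printf '    :sd=%s:\\\n' "$spool_root/$name"
        if [ -n "$filter" ]; then
            printf '    :if=%s:\\\n' "$filter"
        fi
        printf '    :sh:\n'
    } >> "$printcap"
}
```

    Run as the superuser, add_printer dot2 /dev/lp1 /usr/lib/lpf would define a hypothetical printer "dot2" on /dev/lp1 that uses the lpf input filter.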

  32. Contrast the commands lptest > /dev/lp1 versus tunelp /dev/lp1 -i 7.

    The lptest program builds and sends a "worst case" ASCII character ripple pattern to its output, which is redirected directly to a line printer driver. This will determine if the printer can be activated by the driver and, if so, lptest will reveal the print quality.

    The tunelp program allows one to adjust the type of communication and timing between the driver and printer. The above command switches from polling the device to allowing the device to use interrupt 7 to signal completion of an operation.
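
    To make the term "ripple pattern" concrete, here is a small shell sketch that prints rows of printable ASCII, each row shifted one character from the one before. The function name ripple is hypothetical, and the character range and default size are assumptions; the output resembles what lptest produces but is not guaranteed to match it byte for byte.

```shell
#!/bin/sh
# ripple [WIDTH] [LINES]
# Sketch: print LINES rows of WIDTH printable ASCII characters, with
# each row starting one character later than the previous row.
ripple() {
    awk -v width="${1:-79}" -v lines="${2:-10}" 'BEGIN {
        first = 33; count = 94            # "!" (33) through "~" (126)
        for (row = 0; row < lines; row++) {
            out = ""
            for (col = 0; col < width; col++)
                out = out sprintf("%c", first + (row + col) % count)
            print out
        }
    }'
}
```

    Because every printable character eventually appears in every column, a pattern like this exposes a bad print element or a dropped character wherever it occurs on the line.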

  33. Explain the differences among the lpc, lpq, and lprm commands.

    The line printer control, lpc, command leads to an interactive environment that allows the operator (superuser) to determine the status of one or more printers. Assuming something is wrong and the operator is the superuser, print jobs can be stopped, started, cleaned, enabled, or disabled.

    The line printer queue, lpq -Pprinter, command displays the outstanding print requests for a given printer.

    The line printer remove, lprm, command removes a pending print request from the queue.

  34. Describe how to use the lpc command in stand-alone mode.

    Any user can use the lpc command to view the status of a print queue with the command lpc status lp, where "lp" is the name of the print queue. But only the superuser can use the lpc command to modify a print queue. Two frequently used operations are to stop or abort the current print queue and then to start it up again. Thus, the command sequence "lpc abort lp; lpc start lp" is frequently used to reset a hung printer daemon.

    Also, if the default printer is defined in the PRINTER shell variable, the command sequence "lpc abort $PRINTER; lpc start $PRINTER" can reset the default printer on a per-user basis.
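
    The reset sequence can be wrapped in a small shell function. The name reset_queue is hypothetical; the sketch assumes lpc is on the search path and that the caller is the superuser, since only the superuser may modify a queue.

```shell
#!/bin/sh
# reset_queue [QUEUE]
# Sketch: abort and then restart one print queue. With no argument,
# fall back to $PRINTER and, if that is unset, to "lp".
reset_queue() {
    queue=${1:-${PRINTER:-lp}}
    # Only restart the queue if the abort succeeded.
    lpc abort "$queue" && lpc start "$queue"
}
```

    Typing reset_queue lj4 would then reset the "lj4" queue, and a bare reset_queue would reset the user's default printer.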

  35. Describe how to troubleshoot an unresponsive printer.

    1. Check the "/var/spool/lpd/printer/status" file and the "/var/log/lpd-errs" file for error messages.
    2. Check for a "/var/spool/lpd/printer/lock" file and remove it if necessary.
    3. Check the "/var/spool/lpd/printer/" directory for control and data files. There should be at least one "cf" control file and one "df" data file.
    4. Check the "/etc/printcap" entry to ensure it has not been modified.
    5. Test the printer directly with the command: lptest > /dev/lp1.
    6. Check to ensure that the printer can accept the format with the command: cat file > /dev/lp1.
    7. If the printer is a serial printer, check to see if UUCP has taken over ownership of the serial port.
    8. Check to see if shell environmental variables are redirecting program output to another logical printer.
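
    The spool directory checks in steps 1 through 3 lend themselves to a quick shell helper. The function name check_spool is hypothetical; the sketch only reports what it finds and deliberately leaves removal of a stale lock file to the operator.

```shell
#!/bin/sh
# check_spool QUEUE [SPOOL_ROOT]
# Sketch: report the state of one queue's spool directory.
# SPOOL_ROOT defaults to /var/spool/lpd.
check_spool() {
    dir=${2:-/var/spool/lpd}/$1
    if [ ! -d "$dir" ]; then
        echo "missing spool directory: $dir"
        return 1
    fi
    if [ -e "$dir/lock" ]; then
        echo "lock file present"
    fi
    if [ -s "$dir/status" ]; then
        echo "status: $(cat "$dir/status")"
    fi
    # Count control (cf*) and data (df*) files waiting in the queue.
    ncf=$(find "$dir" -type f -name 'cf*' | wc -l | tr -d ' ')
    ndf=$(find "$dir" -type f -name 'df*' | wc -l | tr -d ' ')
    echo "control files: $ncf"
    echo "data files: $ndf"
}
```

    A pending job should show at least one control file and one data file; a lock file with no matching job is a hint that the daemon died while printing.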

  36. Contrast the stty, setterm, and dircolors commands and explain how they are used.

    The stty command sets or clears the traditional Unix terminal characteristics such as Baud, display size, single character command definitions, and serial line datalink definitions.

    The Linux setterm command changes the visual characteristics of a terminal such as background/foreground colors.

    The Linux dircolors command changes the visual characteristics of the ls program. But it must be combined with shell variables and aliases. Only the console terminal interprets the color attributes and the attributes are lost if the output of ls is passed to another program such as more. Here is a bash example.

    export LS_OPTIONS='--color=auto'
    eval `dircolors`
    alias ls='ls $LS_OPTIONS'
    alias ll='ls $LS_OPTIONS -l'
    alias l='ls $LS_OPTIONS -lA'

  37. When you find yourself locked out from the OS, explain how to troubleshoot MBR, password, and file system problems.

    1. Before any problems occur, make a set of "rescue" disks that will boot a kernel image from the floppy disk.
    2. Have the rescue disk load a ram disk from a second floppy containing the utilities such as vi, fdisk, e2fsck, mount, tar, and gzip.
    3. Have the kernel start off bash instead of init.
    4. Examine and modify partitions as required. Be sure to have a saved or printed copy of the old partition table values.
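    5. Use e2fsck to fix the file system. Remember, if the primary superblock is gone, e2fsck can use a backup copy, as in the e2fsck -f -b 8193 /dev/hda1 command (8193 is the usual location of the first backup superblock on a file system with 1 KB blocks).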

    Alternatively, original install media (floppy or CD-ROM) can frequently be used if one did not have the foresight to make a rescue disk. For example, with the Red Hat Linux distribution, you could begin an update or an install and, after the install process has displayed "entering second stage install," a switch can be made to the second virtual console which provides a root shell prompt from which corrections can be made.

    Original by P. Tobin Maginnis - October 1998
    Critical review by William C. DenBesten - December 1998