Simplicity offers ease-of-use but limited functionality. For simplicity, Linux offers prepackaged distributions like Ubuntu that make installing Linux very easy. But if additional functionality is required, the user can progress through deeper and deeper layers of control until the desired element is found and integrated. Six examples of layers of control are:
Gzip automatically removes the uncompressed file and leaves in its place the compressed version. The compressed version records the original uncompressed file name and its original size. Gunzip uncompresses the file, replacing the compressed version with the uncompressed version. Zcat writes the uncompressed data to standard output, leaving the compressed version in place.
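A round trip with a scratch file illustrates all three tools (the file name is made up for illustration):

```shell
printf 'hello linux\n' > note.txt

gzip note.txt         # replaces note.txt with note.txt.gz
gzip -l note.txt.gz   # -l lists the stored original name and size
zcat note.txt.gz      # prints the contents; note.txt.gz stays in place
gunzip note.txt.gz    # replaces note.txt.gz with note.txt again
cat note.txt
```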
An archiver is an essential operating system utility that gathers or scatters many smaller files to/from one larger file. The archiver must also preserve critical system information such as who owns the files, the permission settings, the directory path, as well as hard and symbolic links.
By default, tar reads from or writes to the Unix "raw" tape drive "/dev/rmt0" (raw devices are obsolete in Linux), and the "f" argument redirects the archive to the file name following the argument.
The "-" is used as a file name in command line arguments, but it is not a file. Instead, it is a Unix convention used when a system utility reads or writes more than one file and the user wants to direct the "other" file to or from stdin or stdout.
In the case of tar, it will accept a series of file names to archive from stdin with redirection or dump files to stdout. But if the archive itself is to come from or go to stdin or stdout, then using the dash file "f -" will accomplish the goal.
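The two commands discussed below might look like this (the exact flags are assumptions based on the description, and a sample input directory is created so the sketch runs):

```shell
mkdir -p files && echo hi > files/a.txt   # sample input so the sketch runs

# First form: tar archives "files" to stdout ("f -"), listing each name ("v");
# gzip -9 compresses the stream, trading compression time for space.
tar cvf - files | gzip -9 > archive.tar.gz

# Second form: GNU tar's "z" option runs gzip itself (default level -6).
tar cvzf archive.tz files
```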
In the first command, tar reads and archives "files" to stdout listing each file name as it is archived. The output is piped to gzip which compresses the archive (optimizing for space at the cost of compression time), and the output is then redirected to the file "archive.tar.gz."
The second command uses the "z" option of GNU tar and compresses (with the default "-6" time/space tradeoff) the archive directly. The ".tz" (or more commonly ".tgz") extension means the same thing as ".tar.gz."
The mv command does not copy and delete files. It simply "adjusts" the directory entries, repointing the name in the parent directory at the file's inode, to simulate the move operation. But when the move crosses a volume boundary this shortcut fails, since inode numbers in one volume cannot reference inodes in another volume, and traditional mv implementations simply report an error. To overcome this limitation, one uses a combination of tar and shell programming to physically copy the files.
tar cf - . | (cd ../destination; tar xvf -)
In the above display, the tar command archives the current directory contents and pipes the (unnamed) archive to a subshell. The subshell first changes to the "destination" directory, then runs tar a second time to extract the unnamed archive.
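The same pipeline can be tried with two scratch directories standing in for separate volumes (the directory and file names are made up):

```shell
mkdir -p srcdir destination
echo contents > srcdir/file.txt

# Archive everything under srcdir to stdout, and extract it in destination.
(cd srcdir && tar cf - .) | (cd destination; tar xvf -)

ls destination   # file.txt has been copied across
```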
Support: distributions like Red Hat have become very easy to run. Red Hat 5.1 installs "everything" off a CD-ROM in about 12 minutes! Just tar up those key configuration files in /etc, your home directory, your mail files, and your Web pages, then reload them and you are ready to go again!
Refute: Over time you forget how much customization has been done. You forget about that bug fix in sendmail, the extra features you configured in pine, the extra options configured for vim, etc. Installing a new distribution erases all of these changes and you must begin again, learning the quirks of the new versions.
Programs see only virtual memory, which is mapped onto physical memory by the memory management unit (MMU) hardware. The MMU divides virtual memory into pages. When a program tries to access virtual memory that is not mapped to physical memory by the MMU, a page-fault interrupt occurs: if no physical page is free, a little-used page is written or "swapped" out to the hard disk, and the needed page is then read in from the file system or swap area and mapped into place. Each program sees its pages at its own virtual addresses, but a pure (read-only) page can be mapped into several address spaces at once, so the same physical page is "shared" among many programs. Since shared pages are frequently accessed, they are generally not swapped out.
A "static" linked library module is attached to the program at compile-link time and the program does not share the library module with other programs. Since the library module is physically attached to the program it is always there when the program invokes a function in the module.
In the case of a "shared" library module, only the name of the module to be attached and a loader (i.e., ld.so) are static linked into the existing program. When the first program that has a given module name executes, it loads the shared library module. The module loads on a page boundary and, in this way, the same library module page can be shared (through the MMU) by many programs or shared by multiple copies of the same program.
A "dynamic" library module is similar to a shared module except that the attached loader (i.e., ld-linux.so) waits for the program to explicitly call for loading the library module. Put another way, the dynamic loader allows the program to decide moment by moment how many library modules it wishes to load or unload.
In summary, compile-time static linked programs waste memory but always run. Shared library modules save overall system space and run-time dynamic linked programs use memory extremely efficiently, but if the library module is missing (or the wrong version), the program will not run.
As an aside, this is a far cry from the PC development environments (MS-C, MS-VB, and Borland C++) that statically attach the whole library to the program even though only one module in the library was called.
Efficient memory usage (i.e., a small process footprint) and the use of a large data cache are fundamental reasons why Linux performs as well as it does; however, this speed is traded for additional system management requirements.
The first constraint is that the attached loader (ld-linux.so) has only the name of the dynamically linked module and it must be able to find the correct module for the program to continue executing. The second constraint is that the shared object name (soname) and version number of the dynamically linked module must also match the same name and version number in the library.
To assist in finding dynamic library modules quickly, the system administrator uses the program ldconfig to read the /etc/ld.so.conf file and build a cache of library names and locations (along with the soname symbolic links) for ld-linux.so.
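For example, the loader's configuration and cache can be inspected without root (output varies by system):

```shell
# Directories searched beyond the default /lib and /usr/lib:
cat /etc/ld.so.conf 2>/dev/null

# Dump the cache of known sonames and the paths they resolve to:
ldconfig -p 2>/dev/null | head -5

# After installing a new library, root reruns ldconfig to rebuild the cache.
```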
An object module (also called a relocatable object module) is an unfinished program. Although its high-level source code has been compiled into assembly language and its assembly language has been assembled into ones and zeros, it has not been joined with other modules and it has not been assigned memory addresses. These object modules could be in one (or all) of the three library formats: static, shared, or dynamic, and each library format can have many libraries.
Static libraries employ an archiver. The archiver, ar, works in the same way as tar but it combines object module names, additional "entry points," and a table of contents into the archive. The linking loader (also called a linking editor) searches the table of contents for a module name or entry point and, when found, joins (edits) the library module with the program. The static library is usually housed in /usr/lib and takes the form /usr/lib/libxxx.a where xxx is the library name and ".a" indicates the archive format.
Linux libraries are moving from static to dynamic and the "less efficient" shared library archive format ".sa" is no longer used.
As of now, Linux dynamic libraries do not have an archiver; instead, modules are included in the library by the GNU compiler and loader. The dynamic library is usually housed in /lib and takes the form /lib/libxxx.so.version, where xxx is the library name, ".so" indicates a shared object in the Executable and Linking Format (ELF), and ".version" is the major version number.
The loader that attaches to the program (ld-linux.so) only looks for the major version number, but libraries come with a major, minor, and patch level number to keep track of versions. The symbolic link redirects references to the most recent library version without the need to recompile the programs that call the new library version.
-rwxr-xr-x 1 root root 651112 Jul 16 20:38 /lib/libc-2.0.7.so
lrwxrwxrwx 1 root root     14 Sep  8 12:07 /lib/libc.so.5 -> libc.so.5.4.38
-rwxr-xr-x 1 root root 584776 Jun  7 07:09 /lib/libc.so.5.4.38
lrwxrwxrwx 1 root root     13 Sep  4 10:46 /lib/libc.so.6 -> libc-2.0.7.so

For example, in the above display we see that the shared object library directory has two versions of libc, versions 5 and 6. Version 5 is really "libc.so.5.4.38" and is compatible with older program binaries, while newer programs use version 6, which points to the real library "libc-2.0.7.so."
Thus, the symbolic links refer the shared object name (soname) to the current version of the library, and these links are built by ldconfig at system boot time. ldconfig reads the headers and file names of libraries located in /lib and /usr/lib, as well as other directories listed in the /etc/ld.so.conf configuration file. When ldconfig encounters a library with a new version, it updates the soname symbolic link to the new version. The program ldd examines programs and reports which dynamically linked soname libraries are required for the program to execute.
Normally, one would rm the old symbolic links and use the link program, ln, to create a new link. But the ln program may use the /lib/libc.so.6 dynamic link library and, therefore, will not run once the old symbolic link is deleted. There are three solutions to this dilemma:
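One well-known way out is to let ln both remove and recreate the link in a single invocation with -sf: ln needs the library only to start, and it is already running by the time the old link disappears. A sketch with ordinary files standing in for the real libraries (the "2.0.6" version is hypothetical):

```shell
mkdir -p demo
touch demo/libc-2.0.6.so demo/libc-2.0.7.so   # stand-ins, not real libraries

ln -s libc-2.0.6.so demo/libc.so.6    # the existing soname link
ln -sf libc-2.0.7.so demo/libc.so.6   # -f: remove and recreate in one step
readlink demo/libc.so.6               # now points at libc-2.0.7.so
```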
Even for a non-programmer, re-compilation is not the forbidding and complex task that it appears to be. Most of the details are handled through batch control files called "Makefiles." Second, re-compilation provides complete mastery over how the Linux system is to be configured. Unneeded components can be removed to conserve memory usage while unique file managers or various options can be enabled to improve functionality and performance for your particular system. In other words, re-compilation provides complete control over configuration of the operating system.
Software versions come in dotted decimal numbers like 5.0.12. The "5" is the "major" number and relates to compatibility among a large group of modules. The "0" is the "minor" number and indicates that some of the modules have been modified, but not in a way that makes them incompatible with the group as a whole. The "12" refers to the "patch" level, where minor corrections to one or more modules were made.
By convention, versions with even-numbered "minor" numbers have been tested and are said to be stable, while odd-numbered minor numbers are said to be experimental.
New Linux kernels are downloaded from The Linux Kernel Archives at http://www.kernel.org/. However, the most recent versions are distributed only as the source code "differences" between one patch version and the next. To construct the current source code version of the kernel, one starts with the latest full kernel version (usually in /usr/src/linux) and runs the patch program with the next patch version as input. The patch program reads file names and "diffs" and searches for the matching file names. When a file is found, old source code lines are removed and new lines are added. Since one patch version is unaware of another, the various patch input files must be applied in strict sequence, each to the result of the earlier pass. When the process is complete, the kernel source will be up-to-date.
Errors in the update process can be detected by searching for files created with the suffix of ".rej" or "#".
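A toy demonstration of the mechanism, assuming GNU diff and patch and using made-up directory names in place of real kernel trees:

```shell
mkdir -p linux-old linux-new          # stand-ins for two kernel versions
echo 'VERSION = 1' > linux-old/Makefile
echo 'VERSION = 2' > linux-new/Makefile

# Generate the "differences" (diff exits 1 when files differ, hence || true).
diff -ruN linux-old linux-new > patch-1.2 || true

# Apply the diff to the old tree; -p1 strips the leading tree name.
(cd linux-old && patch -p1 < ../patch-1.2)

cat linux-old/Makefile                # now reads "VERSION = 2"
find linux-old -name '*.rej'          # rejected hunks would appear here
```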
As an aside, the original PDP-11 Unix kernel images were simply called "unix." When Unix was ported to the DEC Virtual Address Extended (VAX) architecture with its 4 GB virtual memory space, paging was added to the kernel (separate from the original swapping code) and the name "vmunix" was born.
System administrators probably dislike nothing more than reconfiguration to accommodate additional peripherals. For example, Microsoft's plug and play in combination with Windows 9x "sniffing" to detect "lost" or new hardware creates complex interactions which may take days to resolve. Traditional Unix implementations are not any better since they require kernel re-compilation to accommodate new hardware.
Linux takes a different approach to the problem with the concept of loadable device drivers. The insmod program loads a device driver from the /lib/modules directory. Upon loading, the driver attempts to initialize the hardware. If successful, the driver remains in the kernel, otherwise it terminates with an error message. The lsmod program lists currently loaded drivers and the rmmod program removes a driver from the kernel.
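A sketch of the driver life cycle described above (the module path and name are hypothetical, and loading requires root):

```shell
# Loading and removing a driver requires root; the commands look like this:
#   insmod /lib/modules/.../dummy.o   # load the driver; it initializes the hardware
#   lsmod                             # list currently loaded drivers
#   rmmod dummy                       # remove the driver from the kernel

# Without privileges, the kernel's list of loaded drivers can still be read:
cat /proc/modules 2>/dev/null | head -5
```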
These functions could not be carried out by existing utilities since they execute as user-level programs. Device drivers must be in the same kernel "address space" so that they can have direct access to kernel data structures.
Loadable device drivers suffer a small run-time penalty since they are reached indirectly through a lookup table. However, any driver can be included directly in the kernel at compile time to improve run-time performance.
Copyright © 1998 Tobin Maginnis
This document is free; you can redistribute it and/or modify it under the terms
of the GNU General Public License as published by the Free Software Foundation.