2. Orientation in the file system

2.1. The path

When you want the system to execute a command, you almost never have to give the full path to that command. For example, we know that the ls command is in the /bin directory (check with which -a ls), yet we don't have to enter the command /bin/ls for the computer to list the content of the current directory.

The PATH environment variable takes care of this. This variable lists those directories in the system where executable files can be found, and thus saves the user a lot of typing and memorizing locations of commands. So the path naturally contains a lot of directories containing bin somewhere in their names, as the user below demonstrates. The echo command is used to display the content ($) of the variable PATH:

rogier:> echo $PATH
/opt/local/bin:/usr/X11R6/bin:/usr/bin:/usr/sbin/:/bin

In this example, the directories /opt/local/bin, /usr/X11R6/bin, /usr/bin, /usr/sbin and /bin are subsequently searched for the required program. As soon as a match is found, the search is stopped, even if not every directory in the path has been searched. This can lead to strange situations. In the first example below, the user knows there is a program called sendsms to send an SMS message, and another user on the same system can use it, but she can't. The difference is in the configuration of the PATH variable:

[jenny@blob jenny]$ sendsms
bash: sendsms: command not found
[jenny@blob jenny]$ echo $PATH
/bin:/usr/bin:/usr/bin/X11:/usr/X11R6/bin:/home/jenny/bin
[jenny@blob jenny]$ su - tony
Password:
tony:~>which sendsms
sendsms is /usr/local/bin/sendsms

tony:~>echo $PATH
/home/tony/bin.Linux:/home/tony/bin:/usr/local/bin:/usr/local/sbin:\
/usr/X11R6/bin:/usr/bin:/usr/sbin:/bin:/sbin

Note the use of the su (switch user) facility, which allows you to run a shell in the environment of another user, on the condition that you know the user's password.

A backslash indicates the continuation of a line on the next, without an Enter separating one line from the other.

In the next example, a user wants to call on the wc (word count) command to check the number of lines in a file, but nothing happens and he has to break off his action using the Ctrl+C combination:

jumper:~> wc -l test

(Ctrl-C)
jumper:~> which wc
wc is hashed (/home/jumper/bin/wc)

jumper:~> echo $PATH
/home/jumper/bin:/usr/local/bin:/usr/local/sbin:/usr/X11R6/bin:\
/usr/bin:/usr/sbin:/bin:/sbin

The use of the which command shows us that this user has a bin-directory in his home directory, containing a program that is also called wc. Since the program in his home directory is found first when searching the paths upon a call for wc, this home-made program is executed, with input it probably doesn't understand, so we have to stop it. To resolve this problem there are several ways (there are always several ways to solve a problem in UNIX/Linux): one answer could be to rename the user's wc program, or the user can give the full path to the exact command he wants, which can be found by using the -a option to the which command.

If the user uses programs in the other directories more frequently, he can change his path to look in his own directories last:

jumper:~> export PATH=/usr/local/bin:/usr/local/sbin:/usr/X11R6/bin:\
/usr/bin:/usr/sbin:/bin:/sbin:/home/jumper/bin

Changes are not permanent!

Note that when using the export command in a shell, the changes are temporary and only valid for this session (until you log out). Opening new sessions, even while the current one is still running, will not result in a new path in the new session. We will see in Section 2, “Your text environment” how we can make these kinds of changes to the environment permanent, adding these lines to the shell configuration files.

2.2. Absolute and relative paths

A path, which is the way you need to follow in the tree structure to reach a given file, can be described as starting from the trunk of the tree (the / or root directory). In that case, the path starts with a slash and is called an absolute path, since there can be no mistake: only one file on the system can comply.

In the other case, the path doesn't start with a slash and confusion is possible between ~/bin/wc (in the user's home directory) and bin/wc in /usr, from the previous example. Paths that don't start with a slash are always relative.

In relative paths we also use the . and .. indications for the current and the parent directory. A couple of practical examples:

  • When you want to compile source code, the installation documentation often instructs you to run the command ./configure, which runs the configure program located in the current directory (that came with the new code), as opposed to running another configure program elsewhere on the system.

  • In HTML files, relative paths are often used to make a set of pages easily movable to another place:

    <img alt="Garden with trees" src="../images/garden.jpg">
    
  • Notice the difference one more time:

    theo:~> ls /mp3
    ls: /mp3: No such file or directory
    theo:~>ls mp3/
    oriental/  pop/  sixties/
    

2.3. The most important files and directories

2.3.1. The kernel

The kernel is the heart of the system. It manages the communication between the underlying hardware and the peripherals. The kernel also makes sure that processes and daemons (server processes) are started and stopped at the exact right times. The kernel has a lot of other important tasks, so many that there is a special kernel-development mailing list on this subject only, where huge amounts of information are shared. It would lead us too far to discuss the kernel in detail. For now it suffices to know that the kernel is the most important file on the system.

2.3.2. The shell

2.3.2.1. What is a shell?

When I was looking for an appropriate explanation on the concept of a shell, it gave me more trouble than I expected. All kinds of definitions are available, ranging from the simple comparison that the shell is the steering wheel of the car, to the vague definition in the Bash manual which says that bash is an sh-compatible command language interpreter, or an even more obscure expression, a shell manages the interaction between the system and its users. A shell is much more than that.

A shell can best be compared with a way of talking to the computer, a language. Most users do know that other language, the point-and-click language of the desktop. But in that language the computer is leading the conversation, while the user has the passive role of picking tasks from the ones presented. It is very difficult for a programmer to include all options and possible uses of a command in the GUI-format. Thus, GUIs are almost always less capable than the command or commands that form the backend.

The shell, on the other hand, is an advanced way of communicating with the system, because it allows for two-way conversation and taking initiative. Both partners in the communication are equal, so new ideas can be tested. The shell allows the user to handle a system in a very flexible way. An additional asset is that the shell allows for task automation.

2.3.2.2. Shell types

Just like people know different languages and dialects, the computer knows different shell types:

  • sh or Bourne Shell: the original shell still used on UNIX systems and in UNIX related environments. This is the basic shell, a small program with few features. When in POSIX-compatible mode, bash will emulate this shell.

  • bash or Bourne Again SHell: the standard GNU shell, intuitive and flexible. Probably most advisable for beginning users while being at the same time a powerful tool for the advanced and professional user. On Linux, bash is the standard shell for common users. This shell is a so-called superset of the Bourne shell, a set of add-ons and plug-ins. This means that the Bourne Again SHell is compatible with the Bourne shell: commands that work in sh, also work in bash. However, the reverse is not always the case. All examples and exercises in this book use bash.

  • csh or C Shell: the syntax of this shell resembles that of the C programming language. Sometimes asked for by programmers.

  • tcsh or Turbo C Shell: a superset of the common C Shell, enhancing user-friendliness and speed.

  • ksh or the Korn shell: sometimes appreciated by people with a UNIX background. A superset of the Bourne shell; with standard configuration a nightmare for beginning users.

The file /etc/shells gives an overview of known shells on a Linux system:

mia:~> cat /etc/shells
/bin/bash
/bin/sh
/bin/tcsh
/bin/csh

Fake Bourne shell

Note that /bin/sh is usually a link to Bash, which will execute in Bourne shell compatible mode when called on this way.

Your default shell is set in the /etc/passwd file, like this line for user mia:

mia:L2NOfqdlPrHwE:504:504:Mia Maya:/home/mia:/bin/bash

To switch from one shell to another, just enter the name of the new shell in the active terminal. The system finds the directory where the name occurs using the PATH settings, and since a shell is an executable file (program), the current shell activates it and it gets executed. A new prompt is usually shown, because each shell has its typical appearance:

mia:~> tcsh
[mia@post21 ~]$
2.3.2.3. Which shell am I using?

If you don't know which shell you are using, either check the line for your account in /etc/passwd or type the command

echo $SHELL

2.3.3. Your home directory

Your home directory is your default destination when connecting to the system. In most cases it is a subdirectory of /home, though this may vary. Your home directory may be located on the hard disk of a remote file server; in that case your home directory may be found in /nethome/your_user_name. In another case the system administrator may have opted for a less comprehensible layout and your home directory may be on /disk6/HU/07/jgillard.

Whatever the path to your home directory, you don't have to worry too much about it. The correct path to your home directory is stored in the HOME environment variable, in case some program needs it. With the echo command you can display the content of this variable:

orlando:~> echo $HOME
/nethome/orlando

You can do whatever you like in your home directory. You can put as many files in as many directories as you want, although the total amount of data and files is naturally limited because of the hardware and size of the partitions, and sometimes because the system administrator has applied a quota system. Limiting disk usage was common practice when hard disk space was still expensive. Nowadays, limits are almost exclusively applied in large environments. You can see for yourself if a limit is set using the quota command:

pierre@lamaison:/> quota -v
Diskquotas for user pierre (uid 501): none

In case quotas have been set, you get a list of the limited partitions and their specific limitations. Exceeding the limits may be tolerated during a grace period with fewer or no restrictions at all. Detailed information can be found using the info quota or man quota commands.

No Quota?

If your system can not find the quota, then no limitation of file system usage is being applied.

Your home directory is indicated by a tilde (~), shorthand for /path_to_home/user_name. This same path is stored in the HOME variable, so you don't have to do anything to activate it. A simple application: switch from /var/music/albums/arno/2001 to images in your home directory using one elegant command:

rom:/var/music/albums/arno/2001> cd ~/images

rom:~/images> pwd
/home/rom/images

Later in this chapter we will talk about the commands for managing files and directories in order to keep your home directory tidy.

2.4. The most important configuration files

As we mentioned before, most configuration files are stored in the /etc directory. Content can be viewed using the cat command, which sends text files to the standard output (usually your monitor). The syntax is straight forward:

cat file1 file2 ... fileN

In this section we try to give an overview of the most common configuration files. This is certainly not a complete list. Adding extra packages may also add extra configuration files in /etc. When reading the configuration files, you will find that they are usually quite well commented and self-explanatory. Some files also have man pages which contain extra documentation, such as man group.

Table 3.3. Most common configuration files

FileInformation/service
aliases Mail aliases file for use with the Sendmail and Postfix mail server. Running a mail server on each and every system has long been common use in the UNIX world, and almost every Linux distribution still comes with a Sendmail package. In this file local user names are matched with real names as they occur in E-mail addresses, or with other local addresses.
apache Config files for the Apache web server.
bashrc The system-wide configuration file for the Bourne Again SHell. Defines functions and aliases for all users. Other shells may have their own system-wide config files, like cshrc.
crontab and the cron.* directories Configuration of tasks that need to be executed periodically - backups, updates of the system databases, cleaning of the system, rotating logs etc.
default Default options for certain commands, such as useradd.
filesystems Known file systems: ext3, vfat, iso9660 etc.
fstab Lists partitions and their mount points.
ftp* Configuration of the ftp-server: who can connect, what parts of the system are accessible etc.
group Configuration file for user groups. Use the shadow utilities groupadd, groupmod and groupdel to edit this file. Edit manually only if you really know what you are doing.
hosts A list of machines that can be contacted using the network, but without the need for a domain name service. This has nothing to do with the system's network configuration, which is done in /etc/sysconfig.
inittab Information for booting: mode, number of text consoles etc.
issue Information about the distribution (release version and/or kernel info).
ld.so.conf Locations of library files.
lilo.conf, silo.conf, aboot.conf etc. Boot information for the LInux LOader, the system for booting that is now gradually being replaced with GRUB.
logrotate.* Rotation of the logs, a system preventing the collection of huge amounts of log files.
mail Directory containing instructions for the behavior of the mail server.
modules.conf Configuration of modules that enable special features (drivers).
motd Message Of The Day: Shown to everyone who connects to the system (in text mode), may be used by the system admin to announce system services/maintenance etc.
mtab Currently mounted file systems. It is advised to never edit this file.
nsswitch.conf Order in which to contact the name resolvers when a process demands resolving of a host name.
pam.d Configuration of authentication modules.
passwd Lists local users. Use the shadow utilities useradd, usermod and userdel to edit this file. Edit manually only when you really know what you are doing.
printcap Outdated but still frequently used printer configuration file. Don't edit this manually unless you really know what you are doing.
profile System wide configuration of the shell environment: variables, default properties of new files, limitation of resources etc.
rc* Directories defining active services for each run level.
resolv.conf Order in which to contact DNS servers (Domain Name Servers only).
sendmail.cf Main config file for the Sendmail server.
services Connections accepted by this machine (open ports).
sndconfig or sound Configuration of the sound card and sound events.
ssh Directory containing the config files for secure shell client and server.
sysconfig Directory containing the system configuration files: mouse, keyboard, network, desktop, system clock, power management etc. (specific to RedHat)
X11 Settings for the graphical server, X. RedHat uses XFree, which is reflected in the name of the main configuration file, XFree86Config. Also contains the general directions for the window managers available on the system, for example gdm, fvwm, twm, etc.
xinetd.* or inetd.conf Configuration files for Internet services that are run from the system's (extended) Internet services daemon (servers that don't run an independent daemon).

Throughout this guide we will learn more about these files and study some of them in detail.

2.5. The most common devices

Devices, generally every peripheral attachment of a PC that is not the CPU itself, is presented to the system as an entry in the /dev directory. One of the advantages of this UNIX-way of handling devices is that neither the user nor the system has to worry much about the specification of devices.

Users that are new to Linux or UNIX in general are often overwhelmed by the amount of new names and concepts they have to learn. That is why a list of common devices is included in this introduction.

Table 3.4. Common devices

NameDevice
cdromCD drive
consoleSpecial entry for the currently used console.
cua*Serial ports
dsp*Devices for sampling and recording
fd*Entries for most kinds of floppy drives, the default is /dev/fd0, a floppy drive for 1.44 MB floppies.
hd[a-t][1-16]Standard support for IDE drives with maximum amount of partitions each.
ir*Infrared devices
isdn*Management of ISDN connections
js*Joystick(s)
lp*Printers
memMemory
midi*midi player
mixer* and musicIdealized model of a mixer (combines or adds signals)
modemModem
mouse (also msmouse, logimouse, psmouse, input/mice, psaux)All kinds of mouses
nullBottomless garbage can
par*Entries for parallel port support
pty*Pseudo terminals
radio*For Radio Amateurs (HAMs).
ram*boot device
sd*SCSI disks with their partitions
sequencerFor audio applications using the synthesizer features of the sound card (MIDI-device controller)
tty*Virtual consoles simulating vt100 terminals.
usb*USB card and scanner
video*For use with a graphics card supporting video.

2.6. The most common variable files

In the /var directory we find a set of directories for storing specific non-constant data (as opposed to the ls program or the system configuration files, which change relatively infrequently or never at all). All files that change frequently, such as log files, mailboxes, lock files, spoolers etc. are kept in a subdirectory of /var.

As a security measure these files are usually kept in separate parts from the main system files, so we can keep a close eye on them and set stricter permissions where necessary. A lot of these files also need more permissions than usual, like /var/tmp, which needs to be writable for everyone. A lot of user activity might be expected here, which might even be generated by anonymous Internet users connected to your system. This is one reason why the /var directory, including all its subdirectories, is usually on a separate partition. This way, there is for instance no risk that a mail bomb, for instance, fills up the rest of the file system, containing more important data such as your programs and configuration files.

/var/tmp and /tmp

Files in /tmp can be deleted without notice, by regular system tasks or because of a system reboot. On some (customized) systems, also /var/tmp might behave unpredictably. Nevertheless, since this is not the case by default, we advise to use the /var/tmp directory for saving temporary files. When in doubt, check with your system administrator. If you manage your own system, you can be reasonably sure that this is a safe place if you did not consciously change settings on /var/tmp (as root, a normal user can not do this).

Whatever you do, try to stick to the privileges granted to a normal user - don't go saving files directly under the root (/) of the file system, don't put them in /usr or some subdirectory or in another reserved place. This pretty much limits your access to safe file systems.

One of the main security systems on a UNIX system, which is naturally implemented on every Linux machine as well, is the log-keeping facility, which logs all user actions, processes, system events etc. The configuration file of the so-called syslogdaemon determines which and how long logged information will be kept. The default location of all logs is /var/log, containing different files for access log, server logs, system messages etc.

In /var we typically find server data, which is kept here to separate it from critical data such as the server program itself and its configuration files. A typical example on Linux systems is /var/www, which contains the actual HTML pages, scripts and images that a web server offers. The FTP-tree of an FTP server (data that can be downloaded by a remote client) is also best kept in one of /var's subdirectories. Because this data is publicly accessible and often changeable by anonymous users, it is safer to keep it here, away from partitions or directories with sensitive data.

On most workstation installations, /var/spool will at least contain an at and a cron directory, containing scheduled tasks. In office environments this directory usually contains lpd as well, which holds the print queue(s) and further printer configuration files, as well as the printer log files.

On server systems we will generally find /var/spool/mail, containing incoming mails for local users, sorted in one file per user, the user's inbox. A related directory is mqueue, the spooler area for unsent mail messages. These parts of the system can be very busy on mail servers with a lot of users. News servers also use the /var/spool area because of the enormous amounts of messages they have to process.

The /var/lib/rpm directory is specific to RPM-based (RedHat Package Manager) distributions; it is where RPM package information is stored. Other package managers generally also store their data somewhere in /var.