All of UNIX is
case sensitive. A command with even a single letter's
capitalization altered is considered to be a completely
different command. The same goes for files, directories,
configuration file formats, and the syntax of all native programming
languages.
In addition to directories and ordinary text files, there are
other types of files, although all files contain the same kind
of data (i.e., a list of bytes). The hidden file is a file
that will not ordinarily appear when you type the command
ls to list the contents of a directory. To see a
hidden file you must use the command
ls -a. The
-a option means to list all files, including
hidden files. Another variant is
ls -l, which lists the
contents in long format. The
- is used in this way to
indicate variations on a command. These are called
command-line options or command-line arguments,
and most UNIX commands can take a number of them. They can be
strung together in any way that is convenient [Commands
under the GNU free software license are superior in this way:
they have a greater number of options than traditional UNIX
commands and are therefore more flexible.], for example,
ls -a -l,
ls -l -a, or
ls -al. Any
of these will list all files in long
format.
Most GNU commands accept the additional argument
--help, and many also accept
-h. You can type a command with just this on the
command-line and get a usage summary. This is some brief
help that will summarize options that you may have forgotten if
you are already familiar with the command--it will
never be an exhaustive description of the usage. See the later
explanation about
man pages.
The difference between a hidden file and an ordinary file
is merely that the file name of a hidden file starts with
a period. Hiding files in this way is not for security, but for
convenience.
The option
ls -l is somewhat cryptic for the novice.
Its more explanatory version is
ls --format=long.
Similarly, the all option can be given as
ls --all, and means the same thing as
ls -a.
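The following is a minimal sketch of how hidden files and combined options behave. It assumes a scratch directory created with mktemp, and the file names visible.txt and .hidden.txt are made up for illustration.

```sh
# Work in a scratch directory so none of your own files are touched.
workdir=$(mktemp -d)
cd "$workdir"

# One ordinary file and one hidden file (its name starts with a period).
touch visible.txt .hidden.txt

plain=$(ls)        # the hidden file is left out
all=$(ls -a)       # the hidden file (plus . and ..) is included

# The same two options written three different ways.
out1=$(ls -a -l)
out2=$(ls -l -a)
out3=$(ls -al)

echo "$plain"
echo "$all"
echo "$out3"
# With GNU ls, ls --all --format=long should give the same listing.
```

All three long listings should be identical, since the order and grouping of the options make no difference here.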
Although commands usually do not display a message when they execute [The computer accepted and processed the command.] successfully, commands do report errors in a consistent format. The format
varies from one command to another but often appears as follows:
command-name: what was attempted: error message. For example, the command
ls -l qwerty gives an error
ls: qwerty: No such file or
directory. What actually happened was that the command
ls attempted
to read the file
qwerty. Since this file does not exist, an error code
2 arose. This error code corresponds to a situation where a file or directory
is not being found. The error code is automatically translated into the sentence
No such file or directory. It is important to understand the distinction between
an explanatory message that a command gives (such as the messages reported by
the
passwd command in the previous chapter) and an error code that
was just translated into a sentence. The reason is that a lot of different kinds
of problems can result in an identical error code (there are only about a hundred
different error codes). Experience will teach you that error messages do not
tell you what to do, only what went wrong, and should not be taken as gospel.
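You can watch this happen with a short experiment. This is only a sketch: it assumes an empty scratch directory, and the exact wording before the final sentence differs between versions of ls. Note also that the exit status a command hands back to the shell is a separate, command-specific number and need not equal the error code described above.

```sh
# Force untranslated messages so the text matches the example above.
LC_ALL=C
export LC_ALL

workdir=$(mktemp -d)
cd "$workdir"

# qwerty does not exist here, so ls writes an error message
# (captured with 2>&1) and returns a nonzero exit status.
msg=$(ls -l qwerty 2>&1)
status=$?

echo "message:     $msg"
echo "exit status: $status"
```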
The file
/usr/include/asm/errno.h
contains a complete list of basic error codes. In addition to these, several
other header files [Files ending in
.h] might
define their own error codes. Under UNIX, however, these are 99%
of all the errors you are ever likely to get. Most of them will
be meaningless to you at the moment but are included in
Table 4.1 as a reference.
ls can produce a lot of output if there are a large number of files
in a directory. Now say that we are only interested in files that end with
the letters
tter. To list only these files, you can use
ls *tter.
The
* matches any number of any other characters. So, for example,
the files
Tina.letter,
Mary_Jones.letter and the file
splatter,
would all be listed if they were present, whereas a file
Harlette would
not be listed. While the
* matches any length of characters, the
? matches only one character. For example, the command
ls ?ar*
would list the files
Mary_Jones.letter and
Harlette.
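To try these two wildcards yourself, you could create the four files named above in a scratch directory. The sketch below assumes mktemp is available and sets LC_ALL=C so that the sort order is predictable.

```sh
LC_ALL=C
export LC_ALL

workdir=$(mktemp -d)
cd "$workdir"
touch Tina.letter Mary_Jones.letter splatter Harlette

# * matches any number of characters before "tter".
tter=$(ls *tter)
echo "$tter"      # Mary_Jones.letter, Tina.letter, splatter

# ? matches exactly one character before "ar".
qar=$(ls ?ar*)
echo "$qar"       # Harlette, Mary_Jones.letter
```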
When naming files, it is a good idea to choose names that group files of the
same type together. You do this by adding an extension to the file name
that describes the type of file it is. We have already demonstrated this by
calling a file
Mary_Jones.letter instead of just
Mary_Jones.
If you keep this convention, you will be able to easily list all the files that
are letters by entering
ls *.letter. The file name
Mary_Jones.letter
is then said to be composed of two parts: the name,
Mary_Jones,
and the extension,
letter.
Some common UNIX extensions you may see are:
.a
Archive.
lib*.a is a static library.
.alias
X Window System font alias catalog.
.avi
Video format.
.au
Audio format (original Sun Microsystems generic sound file).
.awk
awk program source file.
.bib
BibTeX (LaTeX) bibliography source file.
.bmp
Microsoft Bitmap file image format.
.bz2
File compressed with the
bzip2 compression program.
.cc,
.cxx,
.C,
.cpp
C++ program source code.
.cf,
.cfg
Configuration file or script.
.cgi
Executable script that produces web page output.
.conf,
.config
Configuration file.
.csh
csh shell script.
.c
C program source code.
.db
Database file.
.dir
X Window System font/other database directory.
.deb
Debian package for the Debian distribution.
.diff
Output of the
diff program indicating the difference between files or source trees.
.dvi
Device-independent file. Formatted output of
.tex LaTeX file.
.el
Lisp program source.
.g3
G3 fax format image file.
.gif,
.giff
GIF image file.
.gz
File compressed with the
gzip compression program.
.htm,
.html,
.shtm,
.shtml
Hypertext Markup Language. A web page of some sort.
.h
C/C++ program header file.
.i
SWIG source, or C preprocessor output.
.in
configure input file.
.info
Info pages read with the
info command.
.jpg,
.jpeg
JPEG image file.
.lj
LaserJet file. Suitable input to a HP LaserJet printer.
.log
Log file of a system service. This file grows with status messages of some system program.
.lsm
LINUX Software Map entry.
.lyx
LyX word processor document.
.man
Man page.
.mf
Meta-Font font program source file.
.pbm
PBM image file format.
.pcf
PCF image file--intermediate representation for fonts. X Window System font.
.pcx
PCX image file.
.pfb
X Window System font file.
.pdf
Formatted document similar to PostScript or dvi.
.php
PHP program source code (used for web page design).
.pl
Perl program source code.
.ps
PostScript file, for printing or viewing.
.py
Python program source code.
.rpm
RedHat Package Manager
rpm file.
.sgml
Standard Generalized Markup Language. Used to create documents to be converted to many different formats.
.sh
sh shell script.
.so
Shared object file.
lib*.so is a Dynamically Linked Library. [Executable program code shared by more
than one program to save disk space and memory.]
.spd
Speedo X Window System font file.
.tar
tarred directory tree.
.tcl
Tcl/Tk source code (programming language).
.texi,
.texinfo
Texinfo source. Info pages are compiled from these.
.tex
TeX or LaTeX document. LaTeX is for document processing and typesetting.
.tga
TARGA image file.
.tgz
Directory tree that has been archived with
tar, and then compressed with
gzip. Also a package for the Slackware distribution.
.tiff
TIFF image file.
.tfm
LaTeX font metric file.
.ttf
Truetype font.
.txt
Plain English text file.
.voc
Audio format (Soundblaster's own format).
.wav
Audio format (sound files common to Microsoft Windows).
.xpm
XPM image file.
.y
yacc source file.
.Z
File compressed with the
compress compression program.
.zip
File compressed with the
pkzip (or
PKZIP.EXE for DOS) compression program.
.1,
.2 ...
Man page.
In addition, files that have no extension and a capitalized descriptive name are
usually plain English text and meant for your reading. They come bundled with
packages and are for documentation purposes. You will see them hanging around all over
the place.
Some full file names you may see are:
AUTHORS
List of people who contributed to or wrote a package.
ChangeLog
List of developer changes made to a package.
COPYING
Copyright (usually GPL) for a package.
INSTALL
Installation instructions.
README
Help information to be read first, pertaining to the directory the
README is in.
TODO
List of future desired work to be done to package.
BUGS
List of errata.
NEWS
Info about new features and changes for the layman about this package.
There is a way to restrict file listings to within the ranges of certain characters.
If you only want to list the files that begin with A through M, you can
run
ls [A-M]*. Here the brackets have a special meaning--they
match a single character like a
?, but only those given by the range.
You can use this feature in a variety of ways, for example,
[a-dJW-Y]*
matches all files beginning with
a,
b,
c,
d,
J,
W,
X or
Y; and
*[a-d]id matches
all files ending with
aid,
bid,
cid or
did;
and
*.{cpp,c,cxx} matches all files ending in
.cpp,
.c
or
.cxx.
This way of specifying a file name is called a glob expression. Glob
expressions are used in many different contexts, as you will see later.
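Here is a small sketch of the bracket ranges in action. The file names are invented for the purpose, and LC_ALL=C is set because the meaning of a range like A-M can depend on your locale. The brace form is left out because plain sh does not support it.

```sh
LC_ALL=C
export LC_ALL

workdir=$(mktemp -d)
cd "$workdir"
touch Alice Jack Mike Xavier Zed apple bid kid maid

# One character in the range A through M, then anything.
upper=$(ls [A-M]*)
echo "$upper"     # Alice, Jack, Mike

# One character that is a-d, J, or W-Y, then anything.
mixed=$(ls [a-dJW-Y]*)
echo "$mixed"     # Jack, Xavier, apple, bid

# Anything, then one of a-d, then "id".
id=$(ls *[a-d]id)
echo "$id"        # bid, maid
```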
The command
cp stands for copy. It duplicates one or more files. The format is
cp <file> <newfile>
cp <file> [<file> ...] <dir>
or
cp file newfile
cp file [file ...] dir
The above lines are called a usage summary. The
<
and
> signs mean that you don't actually type out these
characters but replace
<file> with a file name of your own.
These are also sometimes written in italics like,
cp file newfile. In rare cases they are written in capitals like,
cp FILE NEWFILE.
<file> and
<dir> are called
parameters. Sometimes they are obviously numeric, like a
command that takes
<ioport>. [Anyone emailing me to
ask why typing in literal,
<,
i,
o,
p,
o,
r,
t and
> characters did not work will
get a rude reply.] These are common conventions used to specify
the usage of a command. The
[ and
] brackets are
also not actually typed but mean that the contents between them are
optional. The ellipses
... mean that
<file> can be
given repeatedly, and these also are never actually typed. From
now on you will be expected to substitute your own parameters by
interpreting the usage summary. You can see that the second of the
above lines is actually just saying that one or more file names can
be listed with a directory name last.
From the above usage summary it is obvious that there are two ways to use the
cp command. If the last name is not a directory, then
cp
copies that file and renames it to the file name given. If the last name is
a directory, then
cp copies all the files listed into that
directory.
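The two forms can be sketched as follows, assuming a scratch directory and some made-up file names (one.txt, two.txt, and a directory called backup).

```sh
workdir=$(mktemp -d)
cd "$workdir"
echo "first" > one.txt
echo "second" > two.txt
mkdir backup

# Form 1: the last name is not a directory, so one.txt is
# copied to a new file with that name.
cp one.txt one-copy.txt

# Form 2: the last name is a directory, so every file listed
# before it is copied into that directory.
cp one.txt two.txt backup

ls
ls backup
```

In both cases the original files are left where they were.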
The usage summary of the
ls command is as follows:
ls [-l, --format=long] [-a, --all] <file> <file> ...
ls -al
where the comma indicates that either option is valid. Similarly,
with the
passwd command:
passwd [<username>]
You should practice using the
cp command now by copying some of your
files from place to place.
The
cd command is used to take you to different directories. Create
a directory
new with
mkdir new. You could create a
directory
one by doing
cd new and then
mkdir one,
but there is a more direct way of doing this with
mkdir new/one. You
can then change directly to the
one directory with
cd new/one.
And similarly you can get back to where you were with
cd ../.. (up two directory levels). In
this way, the
/ is used to represent directories within directories.
The directory
one is called a subdirectory of
new.
The command
pwd stands for present working directory (also called
the current directory) and tells what directory you are
currently in. Entering
pwd gives some output like
/home/<username>.
Experiment by changing to the root directory (with
cd /) and then back
into the directory
/home/<username> (with
cd /home/<username>).
The directory
/home/<username> is called your home directory,
and is where all your personal files are kept. It can be used at any time with
the abbreviation
~. In other words, entering
cd /home/<username>
is the same as entering
cd ~. The process whereby a
~
is substituted for your home directory is called tilde expansion.
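The following sketch walks through cd, pwd, and tilde expansion together. It assumes a scratch directory as the starting point and that your HOME variable points at your home directory, as it normally does.

```sh
workdir=$(mktemp -d)
cd "$workdir"
start=$(pwd)

# Create new and new/one, then change directly into the subdirectory.
mkdir new
mkdir new/one
cd new/one
deep=$(pwd)
echo "$deep"

# Two levels up brings you back to where you started.
cd ../..
back=$(pwd)
echo "$back"

# ~ is expanded by the shell to your home directory.
cd ~
via_tilde=$(pwd)
cd "$HOME"
via_home=$(pwd)
echo "$via_tilde"
```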
To remove (i.e., erase or delete) a file, use the command
rm <filename>. To remove a directory,
use the command
rmdir <dir>. Practice using these two commands. Note
that you cannot remove a directory unless it is empty. To remove a directory
as well as any contents it might contain, use the command
rm -R <dir>.
The
-R option specifies to dive into any subdirectories of
<dir>
and delete their contents. The process whereby a command dives into subdirectories
of subdirectories of ... is called recursion.
-R stands for recursively.
This is a very dangerous command. Although you may be used to ``undeleting'' files
on other systems, on UNIX a deleted file is, at best, extremely difficult to
recover.
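It is safest to practice these commands in a throwaway directory. The sketch below assumes one created with mktemp. The names scrap.txt, empty, and full are invented, and mkdir -p is used only as a shortcut to create nested directories in one step.

```sh
workdir=$(mktemp -d)
cd "$workdir"
mkdir -p empty full/sub
touch scrap.txt full/a.txt full/sub/b.txt

rm scrap.txt      # removes a file
rmdir empty       # removes an empty directory

# rmdir refuses to remove a directory that still has contents.
if rmdir full 2>/dev/null; then
    refused=no
else
    refused=yes
fi
echo "rmdir refused: $refused"

# rm -R removes the directory and everything beneath it.
rm -R full
ls
```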
The
cp command also takes the
-R option, allowing it to copy
whole directories. The
mv command is used to move files and directories.
It really just renames a file to a different directory. Note that with
cp
you should use the option
-p and
-d with
-R to preserve all
attributes of a file and properly reproduce symlinks (discussed later). Hence, always
use
cp -dpR <dir> <newdir> instead of
cp -R <dir> <newdir>.
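As a rough illustration, the sketch below copies a small directory tree that contains a symlink and then renames the copy with mv. It assumes GNU cp (the -d option is not available everywhere), and the ln -s command used to create the symlink is only explained later.

```sh
workdir=$(mktemp -d)
cd "$workdir"
mkdir -p tree/sub
echo "data" > tree/sub/file.txt
ln -s sub/file.txt tree/link     # a symlink inside the tree

# Copy the whole tree, keeping attributes and the symlink itself.
cp -dpR tree tree-copy

# mv simply renames the copied directory.
mv tree-copy tree-moved

ls -l tree-moved
```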
Commands can be given file name arguments in two ways. If you
are in the same directory as the file (i.e., the file is in the current directory),
then you can just enter the file name on its own (e.g.,
cp my_file new_file).
Otherwise, you can enter the full path name, like
cp /home/jack/my_file
/home/jack/new_file. Very often administrators use the notation
./my_file
to be clear about the distinction, for instance,
cp ./my_file ./new_file.
The leading
./ makes it clear that both files are relative to the current
directory. File names not starting with a
/ are called relative path names,
and otherwise, absolute path names.
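The three ways of naming a file can be compared side by side. This sketch assumes a scratch directory whose absolute path is held in the variable workdir. The new_file_1 to new_file_3 names are made up.

```sh
workdir=$(mktemp -d)
cd "$workdir"
echo "hello" > my_file

# Relative path names: interpreted from the current directory.
cp my_file new_file_1
cp ./my_file ./new_file_2

# Absolute path names: start with / and work from anywhere.
cp "$workdir/my_file" "$workdir/new_file_3"

ls
```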
(See Chapter 16 for a complete overview of all
documentation on the system, and also how to print manual pages
in a properly typeset format.)
The command
man [<section>|-a] <command> displays help on
a particular topic and stands for manual. Every command on the entire
system is documented in so-named man pages.
In the past few years a new format of documentation, called info, has evolved.
This is considered the modern way to document commands, but most system documentation
is still available only through
man. Very few packages
are not documented in
man however.
Man pages are the authoritative reference on how a command works because they
are usually written by the very programmer who created the command.
Under UNIX,
any printed documentation should be considered as being second-hand information.
Man pages, however, will often not contain the underlying concepts needed for understanding
the context in which a command is used. Hence, it is not possible for a person to
learn about UNIX purely from man pages. However, once you have the necessary background
for a command, then its man page becomes an indispensable source of information
and you can discard other introductory material.
Now, man pages are divided into sections, numbered 1 through 9. Section 1 contains
all man pages for system commands like the ones you have been using. Sections
2-7 contain information for programmers and the like, which
you will probably not have to refer to just yet. Section 8 contains pages specifically
for system administration commands. There are some additional sections labeled
with letters; other than these, there are no manual pages besides the sections
1 through 9. The sections are
.../man1    User programs
.../man2    System calls
.../man3    Library calls
.../man4    Special files
.../man5    File formats
.../man6    Games
.../man7    Miscellaneous
.../man8    System administration
.../man9    Kernel documentation
You should now use the
man command to look up the manual pages for
all the commands that you have learned. Type
man cp,
man mv,
man rm,
man mkdir,
man rmdir,
man passwd,
man cd,
man pwd, and of course
man man. Much of the
information might be incomprehensible to you at this stage. Skim through the pages
to get an idea of how they are structured and what headings they usually contain.
Man pages are referenced with notation like
cp(1), for the
cp
command in Section 1, which can be read with
man 1 cp. This notation
will be used from here on.
info pages contain some excellent reference and tutorial information
in hypertext linked format. Type
info on its own to go
to the top-level menu of the entire
info hierarchy. You can also
type
info <command> for help on many basic commands.
Some packages will, however, not have info pages, and other UNIX
systems do not support
info at all.
info is an interactive program with keys to
navigate and search documentation. Inside info, typing
? will invoke the help screen from where you can learn more commands.
bc
A calculator program that handles arbitrary precision (very large)
numbers. It is useful for doing any kind of calculation on the command-line.
Its use is left as an exercise.
cal [[0-12] 1-9999]
Prints out a nicely formatted calendar
of the current month, a specified month, or a specified whole year. Try
cal 1 for fun, and
cal 9 1752, the month in which Britain scrapped eleven days in its
switch to the Gregorian calendar to compensate for accumulated round-off error.
cat <filename> [<filename> ...]
Writes the contents of all the
files listed to the screen.
cat can join a lot of files together with
cat <filename> <filename> ... > <newfile>. The file
<newfile> will
be an end-on-end concatenation of all the files specified.
clear
Erases all the text in the current terminal.
date
Prints out the current date and time. (The command
time,
though, does something entirely different.)
df
Stands for disk free and tells you how much free space
is left on your system. The available space usually has the units of kilobytes
(1024 bytes) (although on some other UNIX systems this will be 512 bytes or
2048 bytes). The right-most column tells the directory (in combination with
any directories below that) under which that much space is available.
dircmp
Directory compare. This command compares directories
to see if changes have been made between them. You will often want to see where
two trees differ (e.g., check for missing files), possibly on different computers.
Run
man dircmp (that is,
dircmp(1)). (This is a System 5 command and is not present on LINUX.
You can, however, compare directories with the Midnight Commander,
mc).
du <directory>
Stands for disk usage and prints out the amount
of space occupied by a directory. It recurses into any subdirectories and can
print only a summary with
du -s <directory>. Also try
du --max-depth=1
/var and
du -x / on a system with
/usr and
/home
on separate partitions.
dmesg
Prints a complete log of all messages
printed to the screen during the bootup process.
This is useful if you blinked when your machine was initializing. These messages
might not yet be meaningful, however.
echo
Prints a message to the terminal. Try
echo 'hello there',
echo $[10*3+2],
echo '$[10*3+2]'. The command
echo
-e allows interpretation of certain backslash sequences, for example
echo -e "\a", which prints a bell,
or in other words, beeps the terminal.
echo -n does the same without
printing the trailing newline. In other words, it does not cause a wrap to the
next line after the text is printed.
echo -e -n "\b",
prints a back-space character only, which will erase the last character printed.
exit
Logs you out.
expr <expression>
Calculates the numerical expression
expression.
Most arithmetic operations that you are accustomed to will work. Try
expr
5 + 10 '*' 2. Observe how mathematical precedence is obeyed (i.e., the
*
is worked out before the
+).
file <filename>
Prints out the type of data contained
in a file.
file portrait.jpg will tell you that
portrait.jpg
is
JPEG image data, JFIF standard. The command
file detects an enormous
number of file types, across every platform.
file works by checking whether
the first few bytes of a file match certain tell-tale byte sequences. The byte
sequences are called magic numbers. Their complete list is stored
in
/usr/share/magic. [The word ``magic'' under UNIX normally
refers to byte sequences or numbers that have a specific meaning or implication.
So-called magic numbers are invented for source code,
file formats, and file systems.]
free
Prints out available free memory. You will notice two listings:
swap space and physical memory. These are contiguous as far as the user is concerned.
The swap space is a continuation of your installed memory that exists on disk.
It is obviously slow to access but provides the illusion of much more
available RAM and makes it far less likely that you will run out of memory
(which can be quite fatal).
head [-n <lines>] <filename>
Prints the first
<lines>
lines of a file or 10 lines if the
-n option is not given. (See also
tail below).
hostname [<new-name>]
With no options,
hostname
prints the name of your machine, otherwise it sets the name to
<new-name>.
kbdrate -r <chars-per-second> -d <repeat-delay>
Changes the
repeat rate of your keys.
Most users will like this rate set to
kbdrate -r 32 -d 250
which unfortunately is the fastest the PC can go.
more
Displays a long file by stopping at the end of each page. Run
the following:
ls -l /bin > bin-ls, and then try
more bin-ls. The first
command creates a file with the contents of the output of
ls. This
will be a long file because the directory
/bin has a great many entries.
The second command views the file. Use the space bar to page through
the file. When you get bored, just press q. You can also try
ls -l /bin | more
which will do the same thing in one go.
less
The GNU version of
more, but with extra features. On your
system, the two commands may be the same. With
less, you can use the
arrow keys to page up and down through the file. You can do searches by pressing
/, and then typing in a word to search for
and then pressing Enter. Found words will
be highlighted, and the text will be scrolled to the first found word. The important
commands are:
G
Go to the end of a file.
?ssss
Search backward through a file for the text ssss.
/ssss
Search forward through a file for the text ssss.
[Actually ssss is a regular expression. See Chapter 5 for more info.]
F
Scroll forward and keep trying to read more of the file in case
some other program is appending to it--useful for log files.
nnnG
Go to line nnn of the file.
q
Quit. Used by many UNIX text-based applications (sometimes Ctrl-C).
(You can make
less stop beeping in the irritating way that it does by
editing the file
/etc/profile and adding the lines
LESS=-Q
export LESS
and then logging out and logging in again.
But this is an aside that will make more sense later.)
lynx <url>
Opens a URL [URL stands for Uniform Resource Locator--a web address.] at the console. Try
lynx http://lwn.net/.
links <url>
Another text-based web browser.
nohup <command> &
Runs a command in the background, appending any
output the command may produce to the file
nohup.out in the current directory (or in your home directory if the current directory is not writable).
nohup has the useful feature that the command will continue to run even
after you have logged out. Uses for
nohup will become obvious later.
sleep <seconds>
Pauses for
<seconds> seconds. See also
usleep.
sort <filename>
Prints a file with lines sorted in alphabetical
order. Create a file called
telephone with each line containing a short
telephone book entry. Then type
sort telephone, or
sort telephone
| less and see what happens.
sort takes many interesting options to sort in
reverse (
sort -r), to eliminate duplicate entries (
sort -u),
to ignore leading whitespace (
sort -b), and so on. See
sort(1)
for details.
strings [-n <len>] <filename>
Writes out a binary file, but strips any unreadable
characters. Readable groups of characters are placed on separate lines. If you
have a binary file that you think may contain something interesting but looks
completely garbled when viewed normally, use
strings to sift out the interesting
stuff: try
less /bin/cp and then try
strings /bin/cp.
By default
strings does not print sequences shorter than 4 characters. The
-n
option can alter this limit.
split ...
Splits a file into many separate files. This might have
been used when a file was too big to be copied onto a floppy disk and needed
to be split into, say, 360-KB pieces. Its sister,
csplit, can split
files along specified lines of text within the file. The commands are seldom
used on their own but are very useful within programs that manipulate text.
tac <filename> [<filename> ...]
Writes the contents of all the
files listed to the screen, reversing the order of the lines--that is, printing
the last line of the file first.
tac is
cat backwards and behaves
similarly.
tail [-f] [-n <lines>] <filename>
Prints the last
<lines>
lines of a file or 10 lines if the
-n option is not given. The
-f
option means to watch the file for lines being appended to the end of it.
(See also
head above.)
uname
Prints the name of the UNIX operating system
you are currently using. In this case, LINUX.
uniq <filename>
Prints a file with duplicate lines deleted. The
file must first be sorted.
usleep <microseconds>
Pauses for
<microseconds>
microseconds (1/1,000,000 of a second).
wc [-c] [-w] [-l] <filename>
Counts the number of bytes
(with
-c for
character), or words (with
-w), or lines (with
-l) in a file.
whatis <command>
Gives the first
line of the man page corresponding to
<command>, unless no such
page exists, in which case it prints the message
``nothing appropriate''.
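Several of the commands in this list can be tried together on one small file. The following is a sketch only: the files part1, part2, and fruit are invented, printf is used merely as a convenient way to create them, and LC_ALL=C is set so that sorting is plain alphabetical order.

```sh
LC_ALL=C
export LC_ALL

workdir=$(mktemp -d)
cd "$workdir"
printf 'pear\napple\npear\nfig\n' > part1
printf 'cherry\napple\n' > part2

# cat joins the files end-on-end into a new file of six lines.
cat part1 part2 > fruit

first=$(head -n 2 fruit)      # pear, apple
last=$(tail -n 2 fruit)       # cherry, apple
sorted=$(sort fruit)          # apple, apple, cherry, fig, pear, pear
unique=$(sort fruit | uniq)   # apple, cherry, fig, pear
lines=$(wc -l < fruit)        # number of lines in fruit

# The * is quoted so the shell does not treat it as a wildcard.
sum=$(expr 5 + 10 '*' 2)      # 25, because * is worked out before +

echo "$unique"
echo "lines: $lines  sum: $sum"
```

Note that uniq is fed the output of sort, since it only removes duplicate lines that are next to each other.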
Those who come from the DOS world may remember the famous
Norton Commander file manager. The GNU project has a
Free clone called the Midnight Commander,
mc.
It is essential to at least try out this package--it allows
you to move around files and directories extremely rapidly,
giving a wide-angle picture of the file system. This will
drastically reduce the number of tedious commands you will
have to type by hand.
You should practice using each of these commands if you have
your sound card configured. [I don't want to give the
impression that LINUX does not have graphical applications to do
all the functions in this section, but you should be aware that for every
graphical application, there is a text-mode one that works better and
consumes fewer resources.] You may also find that some of these
packages are not installed, in which case you can come back to
this later.
play [-v <volume>] <filename>
Plays linear audio formats
out through your sound card. These formats are
.8svx,
.aiff,
.au,
.cdr,
.cvs,
.dat,
.gsm,
.hcom,
.maud,
.sf,
.smp,
.txw,
.vms,
.voc,
.wav,
.wve,
.raw,
.ub,
.sb,
.uw,
.sw, or
.ul files. In other words, it plays almost
every type of ``basic'' sound file there is: most often this will be
a simple Windows
.wav file. Specify
<volume> in percent.
rec <filename>
Records from your microphone into a file.
(
play and
rec are from the same package.)
mpg123 <filename>
Plays MPEG audio files, layers
1, 2, or 3. Useful options are
-b 1024 (for increasing the
buffer size to prevent jumping) and
--2to1 (down-samples by
a factor of 2 for reducing CPU load). MPEG files contain sound
and/or video, stored very compactly using digital signal processing
techniques that the commercial software industry seems to think are very
sophisticated.
cdplay
Plays a regular music CD.
cdp is the
interactive version.
aumix
Sets your sound card's volume, gain, recording
volume, etc. You can use it interactively or just enter
aumix
-v <volume> to immediately set the volume in percent. Note that
this is a dedicated mixer program and is considered to be an
application separate from any that play music. Preferably do not set the
volume from within a sound-playing application, even if it claims
this feature--you have much better control with
aumix.
mikmod --interpolate -hq --renice Y <filename>
Plays
Mod files. Mod files are a special type of audio format that
stores only the duration and pitch of the notes that constitute a song,
along with samples of each musical instrument needed to play the
song. This makes for high-quality audio with phenomenally small
file size.
mikmod supports 669, AMF, DSM, FAR, GDM, IMF,
IT, MED, MOD, MTM, S3M, STM, STX, ULT, UNI, and XM audio
formats--that is, probably every type in existence. Actually,
a lot of excellent listening music is available on the Internet in Mod
file format. The most common formats are
.it,
.mod,
.s3m, and
.xm. [Original
.mod files are the
product of Commodore-Amiga computers and had only four tracks. Today's 16
(and more) track Mod files are comparable to any recorded music.]
You usually use Ctrl-C to stop an application or command that runs
continuously. You must type this at the same prompt where you
entered the command. If this doesn't work, the section on
processes (Section 9.5) will explain
about signalling a running application to quit.
Files typically contain a lot of data that one can imagine might be
represented with a smaller number of bytes. Take for example the
letter you typed out. The word ``the'' was probably repeated many
times. You were probably also using lowercase letters most of the
time. The file was by far not a completely random set of bytes, and
it repeatedly used spaces as well as using some letters more than
others. [English text in fact contains, on average, only about
1.3 useful bits (there are eight bits in a byte) of data per byte.] Because of this the file can be compressed to take up less
space. Compression involves representing the same data by using a
smaller number of bytes, in such a way that the original data can be
reconstructed exactly. This usually involves finding patterns in the
data. The command to compress a file is
gzip <filename>,
which stands for GNU zip. Run
gzip on a file in your home
directory and then run
ls to see what happened. Now, use
more to view the compressed file. To uncompress the file
use
gzip -d <filename>. Now, use
more to view
the file again. Many files on the system are stored
in compressed format. For example, man pages are often stored
compressed and are uncompressed automatically when you read
them.
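A quick way to convince yourself that nothing is lost is to compress a file, uncompress it, and compare it with a copy of the original. The sketch below assumes a scratch directory. letter.txt is a made-up file filled with deliberately repetitive text.

```sh
workdir=$(mktemp -d)
cd "$workdir"

# Build a repetitive text file; repetition compresses well.
i=0
while [ "$i" -lt 200 ]; do
    echo "the quick brown fox jumps over the lazy dog"
    i=$((i + 1))
done > letter.txt
cp letter.txt original.txt

before=$(wc -c < letter.txt)
gzip letter.txt                  # letter.txt becomes letter.txt.gz
after=$(wc -c < letter.txt.gz)
echo "bytes before: $before  after: $after"

gzip -d letter.txt.gz            # letter.txt.gz becomes letter.txt again

# cmp is silent and succeeds when the two files are identical.
cmp letter.txt original.txt && echo "restored exactly"
```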
You previously used the command
cat to view a file. You can use the
command
zcat to do the same thing with a compressed file. Gzip a file
and then type
zcat <filename>. You will see that the contents of the
file are written to the screen. Generally, when commands and files have a
z
in them they have something to do with compression--the letter
z stands for
zip. You can use
zcat <filename> | less to view a compressed
file properly. You can also use the command
zless <filename>, which does
the same as
zcat <filename> | less. (Note that your
less may
actually have the functionality of
zless combined.)
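For instance, assuming the zcat that comes with GNU gzip and a made-up file called notes.txt, you might try:

```sh
workdir=$(mktemp -d)
cd "$workdir"
printf 'line one\nline two\n' > notes.txt
gzip notes.txt

# zcat writes the uncompressed contents to the screen and
# leaves notes.txt.gz as it is.
contents=$(zcat notes.txt.gz)
echo "$contents"
ls
```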
A new addition to the arsenal is
bzip2. This is a compression program
very much like
gzip, except that it is slower and compresses 20%-30%
better. It is useful for compressing files that will be downloaded from the
Internet (to reduce the transfer volume). Files that are compressed with
bzip2
have an extension
.bz2. Note that the improvement in compression depends
very much on the type of data being compressed. Sometimes there will be negligible
size reduction at the expense of a huge speed penalty, while occasionally it is
well worth it. Files that are frequently compressed and uncompressed should never
use
bzip2.
You can use the command
find to search for files. Change to the root
directory, and enter
find. It will spew out all the files it can see
by recursively descending [Goes into each subdirectory and all its subdirectories, and repeats the command find.] into all subdirectories. In other words,
find, when executed from
the root directory, prints all the files on the system.
find
will work for a long time if you enter it as you have--press Ctrl-C to stop
it.
Now change back to your home directory and type
find again. You will
see all your personal files. You can specify a number of options to
find
to look for specific files.
find -type d
Shows only directories and not the files they contain.
find -type f
Shows only files and not the directories that contain
them, even though it will still descend into all directories.
find -name <filename>
Finds only files that have the name
<filename>.
For instance,
find -name '*.c' will find all files that end in a
.c extension (
find -name *.c without the quote characters
will not work. You will see why later).
find -name Mary_Jones.letter
will find the file with the name
Mary_Jones.letter.
find -size [+|-]<size>
Finds only files that have a
size larger (for
+) or smaller (for
-) than
<size>,
or the same as
<size> if the sign is not specified. GNU find counts <size> in 512-byte blocks unless you add a unit suffix; append k for kilobytes, as in find -size +100k.
find <directory> [<directory> ...]
Starts
find in each of the specified directories.
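The options above can be tried on a small directory tree that you build yourself. This is a sketch only: the directory and file names are invented for the example (apart from Mary_Jones.letter, borrowed from the text), and the k suffix on -size assumes GNU find.

```shell
# Build a small, made-up directory tree in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir"
mkdir -p letters/old src
touch letters/Mary_Jones.letter letters/old/draft.letter src/main.c src/util.c

# Create a 20-kilobyte file so that -size has something to find.
head -c 20480 /dev/zero > src/big.dat

# Directories only.
find . -type d

# Files ending in .c; the quotes stop the shell from touching the *.
c_files=$(find . -name '*.c' | sort)
echo "$c_files"

# Ordinary files larger than 10 kilobytes.
big_files=$(find . -type f -size +10k)
echo "$big_files"

# Start the search in two named directories instead of the current one.
letters=$(find letters src -name '*.letter' | sort)
echo "$letters"
```

The results are piped through sort only so that the order is predictable; find itself prints files in whatever order it reads them from the directory.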
There are many more options for doing just about any type of search
for a file. See
find(1) for more details (that is, run
man 1 find).
Look also at the
-exec option which causes
find to execute a command
for each file it finds, for example:
find /usr -type f -exec ls '-al' '{}' ';'
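Running -exec over all of /usr produces a great deal of output, so it may be easier to see what is happening on a tiny directory first. The sketch below uses made-up names (docs, a.txt, b.txt) and runs wc -l as a second example of a command that find can execute for each file.

```shell
# Build two small files in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir"
mkdir docs
printf 'one\ntwo\n' > docs/a.txt
printf 'three\n' > docs/b.txt

# '{}' is replaced by the name of each file found;
# ';' marks the end of the command given to -exec.
find docs -type f -exec ls -al '{}' ';'

# The same idea with a different command: count the lines in each file.
line_counts=$(find docs -type f -name '*.txt' -exec wc -l '{}' ';' | sort)
echo "$line_counts"
```

Both '{}' and ';' are quoted so that the shell passes them to find untouched.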
find has the deficiency of actively reading directories to find files. This process is slow,
especially when you start from the root directory. An alternative command is
locate <filename>. This searches through a previously created database
of all the files on the system and hence finds files almost instantaneously, although it knows nothing of files created since the database was last updated. Its counterpart
updatedb updates the database of files used by
locate.
On some systems,
updatedb runs automatically every day at 04h00.
Very often you will want to search through a number of files to find a particular
word or phrase, for example, when a number of files contain lists of
telephone numbers with people's names and addresses. The command
grep
does a line-by-line search through a file and prints only those lines that
contain a word that you have specified.
grep has the command summary:
grep [options] <pattern> <filename> [<filename> ...]
[The words word, string, or pattern are used synonymously
in this context, basically meaning a short length of letters and/or numbers
that you are trying to find matches for. A pattern can also be a string
with kinds of wildcards in it that match different characters, as we shall see
later.]
Run
grep for the word ``the'' to display all lines containing it:
grep
'the' Mary_Jones.letter. Now try
grep 'the' *.letter.
grep -n <pattern> <filename>
shows the line number in the file where the word was found.
grep -<num> <pattern> <filename>
prints out
<num> of
the lines that came before and after each of the lines in which the word was
found.
grep -A <num> <pattern> <filename>
prints out
<num> of
the lines that came
After each of the lines in which the word was found.
grep -B <num> <pattern> <filename>
prints out
<num> of
the lines that came
Before each of the lines in which the word was
found.
grep -v <pattern> <filename>
prints out only those lines that
do not contain the word you are searching for. [
You may think that the
-v option is no longer
doing the same kind of thing that
grep is advertised
to do: i.e., searching for strings. In fact, UNIX commands
often suffer from this--they have such versatility that their functionality
often overlaps with that of other commands. One actually never stops learning
new and nifty ways of doing things hidden in the dark corners of man pages.]
grep -i <pattern> <filename>
does the same as an ordinary
grep but
is case insensitive.
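The grep options above are easiest to compare on a short file whose contents you know. In this sketch the contents of Mary_Jones.letter are invented for the example, as are the variable names.

```shell
# Create a short letter in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir"
cat > Mary_Jones.letter <<'EOF'
Dear Mary,
Thank you for the invitation.
I will bring the cake.
The party starts at noon.
Regards, John
EOF

# -n: show the line number of each matching line.
numbered=$(grep -n 'the' Mary_Jones.letter)
echo "$numbered"

# -i: ignore case, so the line beginning with "The" also matches.
insensitive=$(grep -i 'the' Mary_Jones.letter)
echo "$insensitive"

# -v: print only the lines that do NOT contain the pattern.
inverted=$(grep -v 'the' Mary_Jones.letter)
echo "$inverted"

# -A 1: print each matching line and the one line after it.
after_cake=$(grep -A 1 'cake' Mary_Jones.letter)
echo "$after_cake"
```

Note that the plain search matches only lowercase ``the'', so the line beginning with ``The'' turns up in the -v output as well as in the -i output.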
A package, called the
mtools package, enables
reading and writing to MS-DOS/Windows floppy
disks.
These are not standard UNIX commands but are packaged with most LINUX
distributions. The commands support Windows ``long file name'' floppy disks. Put
an MS-DOS disk in your
A: drive. Try
mdir A:
touch myfile
mcopy myfile A:
mdir A:
Note that there is no such thing as an
A: disk
under LINUX. Only the
mtools package understands
A: in order to retain
familiarity for MS-DOS users. The complete list of commands is
Entering
info mtools will give detailed help. In general,
any MS-DOS command, put into lower case with an
m prefixed to it,
gives the corresponding LINUX command.
Never begin any work before you have a fail-safe method of backing it up.
One of the primary activities of a system administrator is to make backups.
It is essential never to underestimate the volatility [Ability to evaporate or become chaotic.
] of information in a computer. Backups of data are therefore continually made.
A backup is a duplicate of your files that can be used as a replacement
should any or all of the computer be destroyed. The idea is that all of the
data in a directory [As usual, meaning a directory and all its subdirectories and all the files in
those subdirectories, etc.
] are stored in a separate place--often compressed--and can be retrieved
in case of an emergency. When we want to store a number of files in this way,
it is useful to be able to pack many files into one file so that we can perform
operations on that single file only. When many files are packed together into
one, this packed file is called an archive. Usually archives have the
extension
.tar, which stands for tape archive.
To create an archive of a directory, use the
tar command:
tar -c -f <filename> <directory>
Create a directory with a few files in it, and run the
tar command to
back it up. A file of
<filename> will be created. Take careful note
of any error messages that
tar reports. List the file and check that
its size is appropriate for the size of the directory you are archiving. You
can also use the verify option (see the man page) of the
tar
command to check the integrity of
<filename>. Now remove the directory,
and then restore it with the extract option of the
tar command:
tar -x -f <filename>
You should see your directory recreated with all its files intact. A nice option
to give to tar is
-v. This option lists all the files that are being added
to or extracted from the archive as they are processed, and is useful for monitoring
the progress of archiving. You can, of course, call your archive anything
you like; however, the common practice is to call it
<directory>.tar,
which makes it clear to all exactly what it is. Another important option is
-p which preserves detailed attribute information of files.
Once you have your
.tar file, you will probably want to compress it with
gzip.
This will create a file
<directory>.tar.gz, which is sometimes
called
<directory>.tgz for brevity.
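The whole cycle described above (create, check, remove, restore, compress) might look like the following sketch. The names mydir, a.txt and b.txt are made up for the example. The -t option, which lists the contents of an archive, is not covered in the text above but is a standard tar option.

```shell
# Create a small directory to back up, inside a scratch directory.
workdir=$(mktemp -d)
cd "$workdir"
mkdir mydir
echo 'hello' > mydir/a.txt
echo 'world' > mydir/b.txt

# Create the archive; -v lists each file as it is added.
tar -c -v -f mydir.tar mydir

# List what the archive contains before trusting it.
tar -t -f mydir.tar

# Remove the directory, then restore it from the archive.
rm -r mydir
tar -x -v -f mydir.tar
restored=$(cat mydir/a.txt mydir/b.txt)
echo "$restored"

# Compress the archive; gzip replaces mydir.tar with mydir.tar.gz.
gzip mydir.tar
ls -l mydir.tar.gz
```

Only remove the original directory once you have checked the archive, as the listing step does here.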
A second kind of archiving utility is
cpio.
cpio is actually
more powerful than tar, but is considered to be more cryptic to use. The principles
of
cpio are quite similar and its use is left as an exercise.
When you type a command at the shell prompt, it has to be read off disk
out of one or other directory. On UNIX, all such executable
commands are located in one of about four directories. A file is
located in the directory tree according to its type, rather than
according to what software package it belongs to. For
example, a word processor may have its actual executable stored
in a directory with all other executables, while its font files
are stored in a directory with other fonts from all other packages.
The shell has a procedure for searching for executables when you type
them in. If you type in a command with slashes, like
/bin/cp,
then the shell tries to run the named program,
cp, out of the
/bin directory. If you just type
cp on its own,
then it tries to find the
cp command in each of the
directories listed in your
PATH. To see what your
PATH
is, just type
echo $PATH
You will see a colon separated list of four or more directories. Note that
the current directory
. is not
listed. It is important that the current directory
not be listed for reasons of security. Hence, to execute a command
in the current directory, we always type
./<command>.
To append, for example, a new directory
/opt/gnome/bin
to your
PATH, do
PATH="$PATH:/opt/gnome/bin"
export PATH
LINUX supports the convenience of doing this in one line:
export PATH="$PATH:/opt/gnome/bin"
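To check the effect of the change, you can print the PATH one directory per line. This is a small sketch; the variable names old_path and last_entry are made up, and /opt/gnome/bin need not exist for the example to run.

```shell
# Remember the old value so that the change can be compared.
old_path=$PATH

# Append the new directory to the end of the search list.
export PATH="$PATH:/opt/gnome/bin"

# Print the PATH with one directory per line for easier reading.
echo "$PATH" | tr ':' '\n'

# The last entry is everything after the final colon.
last_entry=${PATH##*:}
echo "$last_entry"
```

Because the new directory is appended, commands in the earlier directories are still found first. The change lasts only for the current shell session.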
There is a further command,
which, to check whether a command is
locatable from the
PATH. Sometimes there are two commands of the same
name in different directories of the
PATH. [This is
more often true of Solaris systems than LINUX.] Typing
which <command> locates the one that your shell
would execute. Try:
which ls
which cp mv rm
which which
which cranzgots
which is also useful in shell scripts to tell if
there is a command at all, and hence check whether a particular package
is installed, for example,
which netscape.
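In a shell script, such a check usually relies on the exit status of which rather than on what it prints. The following is a sketch under that assumption: have_command is a hypothetical helper name, and the script assumes a which that exits with a nonzero status when nothing is found, as the common LINUX versions do.

```shell
# have_command is a made-up helper: it succeeds if the named
# command can be found in the PATH, and fails otherwise.
have_command() {
    which "$1" > /dev/null 2>&1
}

# ls is present on practically every system.
if have_command ls; then
    ls_status=found
else
    ls_status=missing
fi
echo "ls: $ls_status"

# cranzgots is a nonsense name that should not exist anywhere.
if have_command cranzgots; then
    cranzgots_status=found
else
    cranzgots_status=missing
fi
echo "cranzgots: $cranzgots_status"
```

The output of which is sent to /dev/null because only its success or failure matters here.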
If a file name happens to begin with a
- then a command would mistake it for an option, making it seemingly impossible
to use that file name as an argument to a command. To overcome this circumstance,
most commands take an option
--. This option specifies that no more
options follow on the command-line--everything else must be treated as
a literal file name. For instance