File Compression and Archiving

Sometimes it is useful to store a group of files in one file so that they can be backed up, easily transferred to another directory, or even transferred to a different computer. It is also sometimes useful to compress files into one file so that they use less disk space and download faster.

It is important to understand the distinction between an archive file and a compressed file. An archive file is a collection of files and directories that are stored in one file. The archive file is not compressed — it uses the same amount of disk space as all the individual files and directories combined. A compressed file is a collection of files and directories that are stored in one file and stored in a way that uses less disk space than all the individual files and directories combined. If you do not have enough disk space on your computer, you can compress files that you do not use very often or files that you want to save but do not use anymore. You can even create an archive file and then compress it to save disk space.

Note

An archive file is not compressed, but a compressed file can be an archive file.
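The difference is easy to see at a shell prompt. The following is a minimal sketch, not a definitive procedure: it builds a throwaway directory (the names demo, a.txt, and b.txt are made up for illustration), archives it with tar, compresses a copy of the archive with gzip, and prints both sizes.

```shell
#!/bin/sh
# Sketch: compare a plain tar archive with a gzip-compressed copy of it.
set -eu
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

make_sample() {
    # Write 5000 identical lines to the file named in $1.
    i=0
    while [ "$i" -lt 5000 ]; do
        echo "the same line of text, repeated"
        i=$((i + 1))
    done > "$1"
}

# Hypothetical sample data.
mkdir demo
make_sample demo/a.txt
make_sample demo/b.txt

# Archive only: tar bundles the files without compressing them.
tar -cf demo.tar demo

# Compress the archive; -c writes to standard output, so demo.tar is kept.
gzip -c demo.tar > demo.tar.gz

tar_size=$(wc -c < demo.tar)
gz_size=$(wc -c < demo.tar.gz)
echo "archive:    $tar_size bytes"
echo "compressed: $gz_size bytes"
```

Because the sample files are highly repetitive, the compressed copy should be far smaller than the archive. Real files usually compress less dramatically.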

Using File Roller

Red Hat Linux includes a graphical utility called File Roller that can compress, decompress, and archive files and directories. File Roller supports common UNIX and Linux file compression and archiving formats, and has a simple interface and extensive help documentation if you need it. It is also integrated into the desktop environment and graphical file manager to make working with archived files easier.

To start File Roller, click Main Menu => Accessories => Archive Manager. You can also start File Roller from a shell prompt by typing file-roller. Figure 12-1 shows File Roller in action.

Tip

If you are using a file manager (such as Nautilus), you can simply double-click the file you wish to unarchive or decompress to start File Roller. The File Roller browser window will appear with the decompressed/unarchived file in a folder for you to extract or browse.

Figure 12-1. File Roller in Action

Decompressing and Unarchiving with File Roller

To unarchive and/or decompress a file, click the Open toolbar button. A file menu will pop up, allowing you to choose the archive you wish to work with. For example, if you have a file called foo.tar.gz located in your home directory, highlight the file and click OK. The file will appear in the main File Roller browser window as a folder, which you can navigate by double-clicking the folder icon. File Roller preserves all directory and subdirectory hierarchies, which is convenient if you are looking for a particular file in the archive. You can extract individual files or entire archives by clicking the Extract button, choosing the directory in which you would like to save the unarchived files, and clicking OK.

Creating Archives with File Roller

If you need to free some hard drive space, or send multiple files or a directory of files to another user over email, File Roller allows you to create archives of your files and directories. To create a new archive, click New on the toolbar. A file browser will pop up, allowing you to specify an archive name and the compression technique (you can usually leave this set as Automatic and simply type in the file archive name and file name extension in the provided text box). Click OK and your new archive is now ready to be filled with files and directories. To add files to your new archive, click Add, which will pop up a browser window (Figure 12-2) that you can navigate to find the file or directory you want to be in the archive. Click OK when you are finished, and Close to close the archive.

Figure 12-2. Creating an Archive with File Roller

Tip

There is much more you can do with File Roller than is explained here. Refer to the File Roller manual (available by clicking Help => Manual) for more information.

Compressing Files at the Shell Prompt

Compressed files use less disk space and download faster than large, uncompressed files. In Red Hat Linux you can compress files with the compression tools gzip, bzip2, or zip.

The bzip2 compression tool is recommended because it provides the most compression and is found on most UNIX-like operating systems. The gzip compression tool can also be found on most UNIX-like operating systems. If you need to transfer files between Linux and other operating systems such as MS Windows, you should use zip because it is more commonly used on those other operating systems.

Table 12-1. Compression Tools

Compression Tool    File Extension    Uncompression Tool
gzip                .gz               gunzip
bzip2               .bz2              bunzip2
zip                 .zip              unzip

By convention, files compressed with gzip are given the extension .gz, files compressed with bzip2 are given the extension .bz2, and files compressed with zip are given the extension .zip.

Files compressed with gzip are uncompressed with gunzip, files compressed with bzip2 are uncompressed with bunzip2, and files compressed with zip are uncompressed with unzip.
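The mapping in Table 12-1 can be written as a small shell function. This is only an illustrative sketch: uncompress_tool is a hypothetical helper name, not a command shipped with Red Hat Linux, and it looks only at the file extension rather than at the file's contents.

```shell
# Hypothetical helper: print the uncompression tool that matches a file
# name's extension, following Table 12-1.
uncompress_tool() {
    case "$1" in
        *.gz)  echo gunzip ;;
        *.bz2) echo bunzip2 ;;
        *.zip) echo unzip ;;
        *)     echo unknown; return 1 ;;
    esac
}

uncompress_tool notes.txt.gz    # prints gunzip
uncompress_tool report.bz2      # prints bunzip2
uncompress_tool photos.zip      # prints unzip
```

Since the extension is only a convention, a file that has been renamed may not match the tool this function suggests.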

Bzip2 and Bunzip2

To use bzip2 to compress a file, type the following command at a shell prompt:

bzip2 filename

The file will be compressed and saved as filename.bz2.

To expand the compressed file, type the following command:

bunzip2 filename.bz2

The filename.bz2 is deleted and replaced with filename.
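If you want to confirm that a round trip through bzip2 and bunzip2 leaves a file unchanged, a sketch along the following lines may help. The file name notes.txt is made up for the example, and the script assumes that bzip2 is installed (it exits quietly if it is not).

```shell
#!/bin/sh
# Sketch: compress a file with bzip2, expand it again, and compare checksums.
set -eu
command -v bzip2 >/dev/null 2>&1 || { echo "bzip2 not installed; skipping"; exit 0; }
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

# notes.txt is a hypothetical sample file.
printf 'line one\nline two\n' > notes.txt
before=$(cksum < notes.txt)

bzip2 notes.txt          # notes.txt is replaced by notes.txt.bz2
ls                       # lists notes.txt.bz2
bunzip2 notes.txt.bz2    # notes.txt.bz2 is replaced by notes.txt

after=$(cksum < notes.txt)
[ "$before" = "$after" ] && echo "contents unchanged after the round trip"
```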

You can use bzip2 to compress multiple files at the same time by listing them with a space between each one:

bzip2 file1 file2 file3

The above command compresses file1, file2, and file3 individually and replaces them with file1.bz2, file2.bz2, and file3.bz2. The bzip2 tool does not combine files into a single file and does not descend into directories. To compress a directory such as /usr/work/school into one file, archive it with tar first (see Archiving Files at the Shell Prompt).
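The following sketch shows what bzip2 does when several files are named on one command line: each file is compressed on its own, and no combined file is created. The file names are placeholders, and the script assumes bzip2 is installed.

```shell
#!/bin/sh
# Sketch: bzip2 with several file arguments compresses each file separately.
set -eu
command -v bzip2 >/dev/null 2>&1 || { echo "bzip2 not installed; skipping"; exit 0; }
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

# Create three hypothetical sample files.
for name in file1 file2 file3; do
    printf 'contents of %s\n' "$name" > "$name"
done

bzip2 file1 file2 file3
ls    # lists file1.bz2, file2.bz2, and file3.bz2
```

To end up with a single compressed file that holds several files or a directory, combine tar with bzip2 as described under Archiving Files at the Shell Prompt.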

Tip

For more information, type man bzip2 and man bunzip2 at a shell prompt to read the man pages for bzip2 and bunzip2.

Gzip and Gunzip

To use gzip to compress a file, type the following command at a shell prompt:

gzip filename

The file will be compressed and saved as filename.gz.

To expand the compressed file, type the following command:

gunzip filename.gz

The filename.gz is deleted and replaced with filename.
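Because gzip and gunzip replace the file they work on, you may sometimes prefer to keep the original. One common approach, sketched below, is the -c option, which writes the result to standard output so that you can redirect it to a new file. The names report.txt and restored.txt are invented for the example.

```shell
#!/bin/sh
# Sketch: use -c to compress and expand without removing the input file.
set -eu
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

# report.txt is a hypothetical sample file.
printf 'quarterly figures\n' > report.txt

gzip -c report.txt > report.txt.gz        # report.txt is left in place
gunzip -c report.txt.gz > restored.txt    # report.txt.gz is left in place

cmp report.txt restored.txt && echo "restored copy matches the original"
```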

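You can use gzip to compress multiple files and directories at the same time by listing them with a space between each one:

gzip -r file1 file2 file3 /usr/work/school

The above command compresses file1, file2, file3, and every file in the /usr/work/school directory (assuming this directory exists) individually, replacing each one with a file of the same name plus the .gz extension. The -r option tells gzip to descend into directories. The gzip tool does not combine the files into a single file. To do that, archive them with tar first (see Archiving Files at the Shell Prompt).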
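The effect of the -r option can be checked with a small experiment. The sketch below assumes GNU gzip (other implementations may not support -r) and uses a made-up directory named school. It shows that every file, including those in subdirectories, is compressed in place rather than gathered into one file.

```shell
#!/bin/sh
# Sketch: gzip -r compresses each file under a directory individually.
set -eu
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

# Hypothetical sample layout: one loose file and a small directory tree.
mkdir -p school/notes
printf 'x\n' > file1
printf 'y\n' > school/essay.txt
printf 'z\n' > school/notes/todo.txt

gzip -r file1 school

# Every regular file now carries the .gz extension.
find . -type f | sort
```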

Tip

For more information, type man gzip and man gunzip at a shell prompt to read the man pages for gzip and gunzip.

Zip and Unzip

To compress a file with zip, type the following command:

zip -r filename.zip filesdir

In this example, filename.zip represents the file you are creating and filesdir represents the directory you want to put in the new zip file. The -r option specifies that you want to include all files contained in the filesdir directory recursively.

To extract the contents of a zip file, type the following command:

unzip filename.zip

You can use zip to compress multiple files and directories at the same time by listing them with a space between each one:

zip -r filename.zip file1 file2 file3 /usr/work/school 

The above command compresses file1, file2, file3, and the contents of the /usr/work/school directory (assuming this directory exists) and places them in a file named filename.zip.
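As a rough illustration of a complete zip round trip, the sketch below creates a zip file from a directory, lists its contents with unzip -l, and extracts it into a separate directory with the -d option. The names filesdir, backup.zip, and restored are placeholders, and the script assumes that both zip and unzip are installed.

```shell
#!/bin/sh
# Sketch: create a zip file, list it, and extract it somewhere else.
set -eu
for tool in zip unzip; do
    command -v "$tool" >/dev/null 2>&1 || { echo "$tool not installed; skipping"; exit 0; }
done
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

# Hypothetical sample directory.
mkdir -p filesdir/sub
printf 'first\n' > filesdir/a.txt
printf 'second\n' > filesdir/sub/b.txt

zip -qr backup.zip filesdir         # -q keeps the output quiet, -r recurses
unzip -l backup.zip                 # list the contents without extracting

mkdir restored
unzip -q backup.zip -d restored     # extract into the restored directory
ls restored/filesdir
```

Unlike gzip and bzip2, zip leaves the original files in place, so filesdir still exists after the zip file is created.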

Tip

For more information, type man zip and man unzip at a shell prompt to read the man pages for zip and unzip.

Archiving Files at the Shell Prompt

A tar file is a collection of several files and/or directories in one file. This is a good way to create backups and archives.

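Some of the options used with the tar command are:

-c  create a new archive
-f  use the archive file name that follows
-t  list the contents of an archive
-v  show the files being processed (verbose)
-x  extract files from an archive
-z  compress or uncompress the archive with gzip
-j  compress or uncompress the archive with bzip2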

To create a tar file, type:

tar -cvf filename.tar files/directories

In this example, filename.tar represents the file you are creating and files/directories represents the files or directories you want to put in the archived file.

You can tar multiple files and directories at the same time by listing them with a space between each one:

tar -cvf filename.tar /home/mine/work /home/mine/school

The above command places all the files in the work and the school subdirectories of /home/mine in a new file called filename.tar in the current directory.

To list the contents of a tar file, type:

tar -tvf filename.tar

To extract the contents of a tar file, type:

tar -xvf filename.tar

This command does not remove the tar file, but it places copies of its contents in the current working directory.
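The create, list, and extract steps can be tried together in a scratch directory. The following is a minimal sketch with invented names (work, school, backup.tar, restore). It also uses the -C option, which tells tar to change to the given directory before extracting, so the copies do not land in the current working directory.

```shell
#!/bin/sh
# Sketch: create a tar file, list it, and extract it into another directory.
set -eu
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

# Hypothetical sample directories.
mkdir work school
printf 'report\n' > work/report.txt
printf 'essay\n' > school/essay.txt

tar -cf backup.tar work school      # create (add -v to see each file name)
listing=$(tar -tf backup.tar)       # list the contents
echo "$listing"

mkdir restore
tar -xf backup.tar -C restore       # extract under restore/
ls restore/work restore/school

# The tar file itself is still there after extraction.
ls -l backup.tar
```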

Remember, the tar command does not compress the files by default. To create a tarred and bzipped compressed file, use the -j option:

tar -cjvf filename.tbz files/directories

tar files compressed with bzip2 are conventionally given the extension .tbz.

This command creates an archive file and then compresses it as the file filename.tbz. If you uncompress the filename.tbz file with the bunzip2 command, the filename.tbz file is removed and replaced with filename.tar.
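The relationship between the .tbz file and the plain tar file can be seen in a short experiment. This sketch uses a made-up directory named project and assumes that bzip2 is installed and that your tar supports the -j option.

```shell
#!/bin/sh
# Sketch: a .tbz file is a tar archive compressed with bzip2.
set -eu
command -v bzip2 >/dev/null 2>&1 || { echo "bzip2 not installed; skipping"; exit 0; }
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

# Hypothetical sample directory.
mkdir project
printf 'hello\n' > project/readme.txt

tar -cjf project.tbz project    # archive and compress in one step
bunzip2 project.tbz             # project.tbz is replaced by project.tar
ls                              # lists project and project.tar

listing=$(tar -tf project.tar)  # the result is an ordinary tar file
echo "$listing"
```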

You can also expand and unarchive a bzip tar file in one command:

tar -xjvf filename.tbz

To create a tarred and gzipped compressed file, use the -z option:

tar -czvf filename.tgz files/directories

tar files compressed with gzip are conventionally given the extension .tgz.

This command creates the archive file filename.tar and then compresses it as the file filename.tgz. (The file filename.tar is not saved.) If you uncompress the filename.tgz file with the gunzip command, the filename.tgz file is removed and replaced with filename.tar.

You can expand a gzip tar file in one command:

tar -xzvf filename.tgz
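One way to convince yourself that the -z option simply combines tar and gzip is to build the same archive both ways and compare the listings. The sketch below does that with a hypothetical directory named project. The file piped.tgz is produced by piping tar into gzip, which is an alternative shown for illustration only, and the example assumes a tar that accepts - as standard input and output.

```shell
#!/bin/sh
# Sketch: tar -z is equivalent to running tar and gzip as separate steps.
set -eu
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

# Hypothetical sample directory.
mkdir project
printf 'hello\n' > project/readme.txt
printf 'notes\n' > project/notes.txt

tar -czf project.tgz project              # one step
tar -cf - project | gzip > piped.tgz      # two steps joined by a pipe

one=$(tar -tzf project.tgz)
two=$(gunzip -c piped.tgz | tar -tf -)
[ "$one" = "$two" ] && echo "both files list the same contents"

mkdir restore
tar -xzf project.tgz -C restore           # expand and unarchive under restore/
ls restore/project
```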

Tip

Type the command man tar for more information about the tar command.