02. Compressing Files: gzip, gunzip, zcat, zless

gzip

One of the most widely used Unix file compression commands is gzip. gzip replaces the original file with a compressed version that has a .gz file extension.

$ ls -l
-rw-r--r--  1 JohnDoe       staff   285 May  9 13:27 README.md
-rw-r--r--  1 JohnDoe       staff  6459 May  9 13:27 index.html
-rw-r--r--  1 JohnDoe       staff  5341 May  9 13:27 todo.txt
$ gzip README.md todo.txt index.html
$ ls -l
-rw-r--r--  1 JohnDoe  staff    56 May  9 13:27 README.md.gz
-rw-r--r--  1 JohnDoe  staff  2626 May  9 13:27 index.html.gz
-rw-r--r--  1 JohnDoe  staff  2113 May  9 13:27 todo.txt.gz

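As you can see, each file has been compressed individually. Note, however, that gzip cannot pack a folder into a single compressed file: given a directory it skips it, and with the -r option it compresses each file inside separately.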
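
To make the directory behaviour concrete, here is a minimal sketch. The scratch directory and the file names (docs, a.txt, b.txt) are made up for illustration, and the exact warning gzip prints for a directory varies between implementations, so it is silenced here.

```shell
# Hypothetical scratch directory and file names, for illustration only.
workdir=$(mktemp -d)
mkdir "$workdir/docs"
printf 'hello\n' > "$workdir/docs/a.txt"
printf 'world\n' > "$workdir/docs/b.txt"

# Without -r, gzip skips the directory and leaves its files untouched.
gzip "$workdir/docs" 2>/dev/null || echo "gzip skipped the directory"

# With -r, every file inside is compressed on its own.
gzip -r "$workdir/docs"
listing=$(ls "$workdir/docs")
echo "$listing"

rm -rf "$workdir"
```

The listing shows one .gz file per original file rather than a single archive.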

gunzip

To restore a file to its uncompressed form, use gunzip. The uncompressed file will have the same permissions and timestamp as when it was gzipped.

$ gunzip README.md index.html todo.txt
$ ls -l
-rw-r--r--  1 JohnDoe  staff   285 May  9 13:27 README.md
-rw-r--r--  1 JohnDoe  staff  6459 May  9 13:27 index.html
-rw-r--r--  1 JohnDoe  staff  5341 May  9 13:27 todo.txt

Note how we don't need to specify the .gz in the filename, as that is already assumed.
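
As a quick check that the round trip is lossless, the sketch below compresses a file, restores it without typing the .gz suffix, and compares the result with a saved copy. The file names (notes.txt, notes.orig) are hypothetical, and the implied-suffix lookup is assumed to behave as in GNU gzip.

```shell
# Hypothetical scratch directory and file names, for illustration only.
workdir=$(mktemp -d)
printf 'line one\nline two\n' > "$workdir/notes.txt"
cp "$workdir/notes.txt" "$workdir/notes.orig"

gzip "$workdir/notes.txt"      # leaves only notes.txt.gz
gunzip "$workdir/notes.txt"    # the .gz suffix is implied

# cmp -s is silent and exits 0 only when both files are byte-identical.
if cmp -s "$workdir/notes.txt" "$workdir/notes.orig"; then
  roundtrip=identical
else
  roundtrip=different
fi
echo "round trip: $roundtrip"

rm -rf "$workdir"
```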

Options

There are several options you can use with gzip. Here is a list of the most common ones; be sure to check the man page for more.

-c
Write output to standard output and keep original files.
-d
Decompress - same as using gunzip.
-f
Force compression even if the output file already exists or the file has multiple links.
-h
Display usage information. May also be specified as --help.
-k
Retain original files.
-l
List the compressed size, uncompressed size, compression ratio and original name of each compressed file.
-r
Recursively compress files in the directory.
-t
Test the integrity of a compressed file.
-v
Verbose. Display the name and percentage reduction for each file processed.
-number
Set amount of compression. number is an integer in the range of 1 to 9. 1 is fastest, but has the least compression. 9 is slowest with the most compression. The default is 6.
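
A few of these options can be combined in one short session. The sketch below builds a hypothetical sample.txt, compresses it at levels 1 and 9 with -c, and checks the result with -t. The exact sizes depend on the gzip build, so none are quoted here.

```shell
# Hypothetical scratch directory and file names, for illustration only.
workdir=$(mktemp -d)
i=0
while [ "$i" -lt 2000 ]; do
  echo "entry $i: the quick brown fox jumps over the lazy dog"
  i=$((i + 1))
done > "$workdir/sample.txt"

# -c writes to standard output, so sample.txt itself is left alone.
gzip -1 -c "$workdir/sample.txt" > "$workdir/fast.gz"
gzip -9 -c "$workdir/sample.txt" > "$workdir/best.gz"

fast_size=$(wc -c < "$workdir/fast.gz" | tr -d ' ')
best_size=$(wc -c < "$workdir/best.gz" | tr -d ' ')
echo "level 1: $fast_size bytes, level 9: $best_size bytes"

# -t exits with status 0 when the compressed data is intact.
if gzip -t "$workdir/best.gz"; then integrity=ok; else integrity=corrupt; fi
echo "integrity: $integrity"

rm -rf "$workdir"
```

For text like this, level 9 should come out no larger than level 1, at the cost of more CPU time.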

Keeping original files

Suppose you want to keep the original files alongside the gzipped copies. Simply pass the -k option.

$ gzip -k README.md
$ ls 
...
README.md   README.md.gz  
...
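
Some older gzip builds do not accept -k. In that case a common fallback is -c with a redirect, which writes the compressed data to a new file and never touches the original. This is a sketch that uses a throwaway README.md in a scratch directory, not the file from the listing above.

```shell
# Hypothetical scratch directory, for illustration only.
workdir=$(mktemp -d)
printf 'some project notes\n' > "$workdir/README.md"

# -c sends the compressed stream to standard output; we redirect it ourselves.
gzip -c "$workdir/README.md" > "$workdir/README.md.gz"

kept=$(ls "$workdir")
echo "$kept"

# Confirm the compressed copy decompresses to the same bytes as the original.
if gunzip -c "$workdir/README.md.gz" | cmp -s - "$workdir/README.md"; then
  copy=matches
else
  copy=differs
fi
echo "compressed copy: $copy"

rm -rf "$workdir"
```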

Viewing compression ratio

We can view the compression ratio with the -l option.

$ gzip index.html
$ gzip -l index.html.gz
  compressed uncompressed  ratio uncompressed_name
        2626         6459  59.3% index.html
$ gzip -l README.md.gz
  compressed uncompressed  ratio uncompressed_name
          56          285  80.3% README.md

A ratio of 80.3% suggests that our README.md file contains a lot of repeated elements, which is exactly what gzip compresses best.
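
To see how strongly repetition drives the ratio, the sketch below compresses two made-up files: one that repeats a single line, and one holding a hex dump of random bytes. The helper saved_percent is hypothetical and computes only a rough figure from file sizes, so it will not match the gzip -l column to the decimal.

```shell
# Hypothetical scratch directory, file names and helper, for illustration only.
workdir=$(mktemp -d)

# Repetitive input: the same line 1000 times.
i=0
while [ "$i" -lt 1000 ]; do
  echo "TODO: write the documentation for this section"
  i=$((i + 1))
done > "$workdir/repetitive.txt"

# Varied input: a hex dump of 8 KiB of random bytes.
dd if=/dev/urandom bs=1024 count=8 2>/dev/null | od -An -tx1 > "$workdir/varied.txt"

gzip -c "$workdir/repetitive.txt" > "$workdir/repetitive.txt.gz"
gzip -c "$workdir/varied.txt" > "$workdir/varied.txt.gz"

# Rough space saving in whole percent: 100 - compressed * 100 / original.
saved_percent() {
  orig_bytes=$(wc -c < "$1" | tr -d ' ')
  comp_bytes=$(wc -c < "$2" | tr -d ' ')
  echo $(( 100 - comp_bytes * 100 / orig_bytes ))
}

rep_saved=$(saved_percent "$workdir/repetitive.txt" "$workdir/repetitive.txt.gz")
var_saved=$(saved_percent "$workdir/varied.txt" "$workdir/varied.txt.gz")
echo "repetitive: ${rep_saved}% saved, varied: ${var_saved}% saved"

# gzip -l reports its own figures for the same files.
gzip -l "$workdir/repetitive.txt.gz" "$workdir/varied.txt.gz"

rm -rf "$workdir"
```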

zcat and zless

zcat is the same as using gunzip -c. It decompresses your file and prints the result to standard output, leaving the .gz file in place.

$ zcat README.md.gz | less
# unzipped and now you can read with less
$ zless README.md.gz
# same function as above
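
Because the output goes to standard output, it can feed any pipeline without creating an uncompressed copy on disk. The sketch below uses gunzip -c, which the text above equates with zcat. The file words.txt is hypothetical.

```shell
# Hypothetical scratch directory and file names, for illustration only.
workdir=$(mktemp -d)
printf 'alpha\nbeta\ngamma\n' > "$workdir/words.txt"
gzip "$workdir/words.txt"

# Stream the decompressed contents into another command.
line_count=$(gunzip -c "$workdir/words.txt.gz" | wc -l | tr -d ' ')
echo "lines: $line_count"

# The compressed file is still there and no words.txt was recreated.
remaining=$(ls "$workdir")
echo "$remaining"

rm -rf "$workdir"
```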

Compressing already compressed files

Avoid compressing files that are already in a compressed format. File types such as .mp3 and .jpeg are already compressed, so running gzip on them saves little or nothing and may even make the file slightly larger.
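
We can observe this without a real media file. In the sketch below, 16 KiB of random bytes stand in for an already compressed file. That is an assumption made for illustration: random data, like well-compressed data, has almost no repetition for gzip to exploit. The name media.bin is hypothetical.

```shell
# Hypothetical scratch directory and file names, for illustration only.
workdir=$(mktemp -d)

# 16 KiB of random bytes as a stand-in for an already compressed file.
dd if=/dev/urandom of="$workdir/media.bin" bs=1024 count=16 2>/dev/null

gzip -c "$workdir/media.bin" > "$workdir/media.bin.gz"

before=$(wc -c < "$workdir/media.bin" | tr -d ' ')
after=$(wc -c < "$workdir/media.bin.gz" | tr -d ' ')
echo "before: $before bytes, after: $after bytes"

rm -rf "$workdir"
```

The "after" figure comes out slightly larger because gzip still has to add its own header and trailer around data it could not shrink.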

As you can tell, gzip was not meant to bundle multiple files into a single archive. We use the tar command for that, which we'll see in the next lesson.
