02. Sorting sort

The sort command sorts the contents of standard input or of one or more files. You've probably seen and used it before as part of a pipeline.

There are tons of different ways you can specify sorts, and we'll go over the important ones.

Terminology

The input to sort should be a stream of records, each terminated by a newline character. Each characteristic (or column) of a record is known as a field, and fields are separated by a user-specified character (the delimiter). This is most often a tab, comma, or semicolon.

Default sort settings

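By default, the sort command compares whole lines, starting from the first character, and orders them by the collation rules of the current locale. For plain text this usually means alphabetical order.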

$ cat employees.txt
Caelestinus, Joon     Directory
Photios, Roland       CEO
Eliseus, Meindert     Secretary
Pino, Derryl          Assistant
Gemini, Klaos         Assistant
$ sort employees.txt
# Lines are compared from their first character, in alphabetical order
Caelestinus, Joon     Directory
Eliseus, Meindert     Secretary
Gemini, Klaos         Assistant
Photios, Roland       CEO
Pino, Derryl          Assistant
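
What counts as "alphabetical" depends on the locale. The following is a minimal sketch, assuming a hypothetical file fruit.txt (name and contents invented for illustration). It forces the C locale with LC_ALL=C so the result is the same on any machine, then compares plain byte order with the case-folding -f option.

```shell
# Hypothetical sample data with mixed case.
workdir=$(mktemp -d)
printf '%s\n' banana Cherry apple > "$workdir/fruit.txt"

# In the C locale bytes are compared, so every uppercase letter sorts before lowercase.
byte_order=$(LC_ALL=C sort "$workdir/fruit.txt")

# -f folds case before comparing, which gives a case-insensitive order.
folded_order=$(LC_ALL=C sort -f "$workdir/fruit.txt")

printf 'C locale:\n%s\n\nwith -f:\n%s\n' "$byte_order" "$folded_order"
```

In the C locale, Cherry comes first because of its capital letter; with -f it moves after banana.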

Sorting multiple files at once

You can pass several files to sort at once. Their contents are combined and written out as a single sorted stream.

$ sort names1.txt names2.txt names3.txt
# (names1.txt, names2.txt, names3.txt are unsorted files)
Avi
Bobby
Brian
Cat
Derrick
Duke
Irvin
JB
Jen
Jim
Jizelle
John
Ragmar
Shawn
Telly

Merging two sorted files

You can use the -m option to merge input files that are already sorted. The files only need to be interleaved, not sorted again, which is cheap for sort because its implementation is based on merge sort.

$ cat sortedMales.txt
Blowers, Nigel
Gilbertson, Collin
Imhoff, Parker
Peru, Colton
Shisler, Odell
Twine, Leonardo
$ cat sortedFemales.txt
Aurea, Levin
Deshazo, Taryn
Duong, Arminda
Moorhead, Alyse
Murrieta, Caroll
Yen, Beata
$ sort -m sortedMales.txt sortedFemales.txt
Aurea, Levin
Blowers, Nigel
Deshazo, Taryn
Duong, Arminda
Gilbertson, Collin
Imhoff, Parker
Moorhead, Alyse
Murrieta, Caroll
Peru, Colton
Shisler, Odell
Twine, Leonardo

Specifying a delimiter

To specify the delimiter, use the -t option followed by the delimiter character. Wrap it in single quotes so the shell does not interpret it.
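
As a minimal sketch, assume a hypothetical semicolon-delimited file staff.txt (the file name and its contents are invented for illustration). The delimiter only matters once a key is selected, so the example also uses the -k option described in the next section. LC_ALL=C is there only to make the ordering independent of the locale.

```shell
# Hypothetical sample data: name;department;age
workdir=$(mktemp -d)
printf '%s\n' 'Pino;Audit;41' 'Gemini;Sales;29' 'Eliseus;Legal;35' > "$workdir/staff.txt"

# -t ';' makes the semicolon the field delimiter; -k2,2 sorts on the department only.
by_dept=$(LC_ALL=C sort -t ';' -k2,2 "$workdir/staff.txt")

printf '%s\n' "$by_dept"
```

The records come out in department order (Audit, Legal, Sales) rather than in name order.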

Specifying the field

Furthermore, we can specify the field with the -k option, followed by the field (column) number. With just a single integer (e.g. -k2), the sort key begins at field 2 and extends to the end of the line. With -k2,2, the key is limited to the second field.
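
The difference between the two forms is easiest to see on a small input. This is a sketch on a hypothetical comma-separated file rows.csv (invented for illustration), run in the C locale so the result is predictable.

```shell
# Hypothetical sample data: two records share the same second field.
workdir=$(mktemp -d)
printf '%s\n' 'a,x,2' 'b,x,1' 'c,w,9' > "$workdir/rows.csv"

# -k2: the key runs from field 2 to the end of the line, so field 3 also affects the order.
open_key=$(LC_ALL=C sort -t ',' -k2 "$workdir/rows.csv")

# -k2,2: the key is field 2 only; ties fall back to comparing the whole line.
closed_key=$(LC_ALL=C sort -t ',' -k2,2 "$workdir/rows.csv")

printf 'with -k2:\n%s\n\nwith -k2,2:\n%s\n' "$open_key" "$closed_key"
```

With -k2 the two x records are ordered by what follows the x (b,x,1 before a,x,2). With -k2,2 their keys are equal, so sort falls back to the whole line and a,x,2 comes first.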

Sorting by sub-fields

If a field contains several parts that you'd like to sort on separately, you can address individual characters within the field by writing field.character. For example, say you have a date in the 3rd field formatted as MM-DD-YYYY. It makes sense to order by year first, then month, then day: 3.7 starts a key at the 7th character of field 3 (the year), 3.1 at the month, and 3.4 at the day.

$ sort -t ',' -k 3.7n -k 3.1n -k 3.4n dates.txt
Tel,Aziz,12-31-1989
Ping,Sarah,09-29-1990
Het,Holm,01-01-1992
Hum,Horry,04-23-1995
Ith,Rebecca,06-12-2001
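
Each key above is open-ended and relies on numeric comparison stopping at the first non-digit. A slightly stricter variant gives every key an explicit end position. The sketch below assumes a hypothetical file dates.csv (records invented for illustration) in which several dates share a year, so the month and day keys actually get used.

```shell
# Hypothetical sample data: first,last,MM-DD-YYYY
workdir=$(mktemp -d)
printf '%s\n' \
  'Ana,Lee,12-31-1990' \
  'Bo,Kim,02-15-1990' \
  'Cy,Roy,02-03-1990' \
  'Di,Fox,07-04-1985' > "$workdir/dates.csv"

# Year = characters 7-10 of field 3, month = characters 1-2, day = characters 4-5.
by_date=$(LC_ALL=C sort -t ',' -k 3.7,3.10n -k 3.1,3.2n -k 3.4,3.5n "$workdir/dates.csv")

printf '%s\n' "$by_date"
```

The 1985 record comes first, and the three 1990 records follow in month and then day order.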

Ignoring blank space

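To ignore leading blanks (spaces and tabs) at the start of a sort key, use the -b option. This matters when the delimiter is followed by a space, as in "Last, First".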

$ sort -t',' -k2,2 -b sortedMales.txt sortedFemales.txt
Moorhead, Alyse
Duong, Arminda
Yen, Beata
Murrieta, Caroll
Gilbertson, Collin
Peru, Colton
Twine, Leonardo
Aurea, Levin
Blowers, Nigel
Shisler, Odell
Imhoff, Parker
Deshazo, Taryn

Now our file is sorted by the second field (first name).
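
The effect of -b only shows up when records differ in how many blanks precede the key. Here is a minimal sketch on a hypothetical file names.txt (invented for illustration) where one record has two spaces after the comma. The -b is placed before -k, which is the portable way to apply it to every key.

```shell
# Hypothetical sample data: the second record has an extra space after the comma.
workdir=$(mktemp -d)
printf '%s\n' 'Smith, Bob' 'Jones,  Cy' > "$workdir/names.txt"

# Without -b the leading spaces are part of the key, so "  Cy" sorts before " Bob".
with_blanks=$(LC_ALL=C sort -t ',' -k2,2 "$workdir/names.txt")

# With -b the leading spaces are skipped, so "Bob" sorts before "Cy".
without_blanks=$(LC_ALL=C sort -b -t ',' -k2,2 "$workdir/names.txt")

printf 'without -b:\n%s\n\nwith -b:\n%s\n' "$with_blanks" "$without_blanks"
```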

Sorting numbers

If you try to sort a file full of numbers, the output may not be what you expect, because the lines are compared as text, character by character. For example, try sorting a list of the even numbers from 2 to 10.

$ sort oneToTen.txt
10
2
4
6
8

To get the correct results, we must pass in the -n option, which compares the lines by numeric value.

$ sort -n oneToTen.txt
2
4
6
8
10
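
Numeric comparison is not limited to positive whole numbers. As a small sketch, assuming a hypothetical file numbers.txt (invented for illustration) that also contains a negative and a decimal value:

```shell
# Hypothetical sample data: a mix of integers, a negative number, and a decimal.
workdir=$(mktemp -d)
printf '%s\n' 10 9 -3 2.5 > "$workdir/numbers.txt"

# Text comparison: "10" sorts before "2.5" because the character '1' precedes '2'.
as_text=$(LC_ALL=C sort "$workdir/numbers.txt")

# Numeric comparison: the minus sign and the decimal point are understood.
as_numbers=$(LC_ALL=C sort -n "$workdir/numbers.txt")

printf 'as text:\n%s\n\nwith -n:\n%s\n' "$as_text" "$as_numbers"
```

LC_ALL=C is set here so that the period is treated as the decimal point regardless of the system locale.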

More options

There are plenty more options you can check through the man page. Here are the most frequently used ones.

-b
Ignore leading blanks when locating sort keys.
-c
Just check that input is correctly sorted. Exit code will be nonzero if not.
-d
Use dictionary order: only blanks and alphanumeric characters are considered in comparisons.
-f
Case-insensitive sort. f is for "folding": each lowercase letter is treated as its corresponding uppercase letter.
-g
General numeric order. Compare keys as floating-point numbers.
-k
Define the sort key. -k 2 sorts on everything from the second field (aka second column) to the end of the line; -k 2,2 sorts on the second field only.
-i
Ignore non-printable characters.
-o
Specify the output file. Default is standard output.
-m
Merge already sorted input files.
-n
Compare keys by numeric value. Leading blanks, an optional minus sign, digits, and an optional decimal point are read from the start of the key.
-r
Reverse sorting order.
-R
Random sort (not truly random: records are ordered by a random hash of their keys, so records with equal keys stay together).
-t
Specify the character to use as the separator of fields instead of whitespace. -t ';' would separate fields with a semicolon.
-u
Keep only the first of each set of records with equal keys. All other repeated records are discarded.
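
Several of these options combine naturally. The sketch below uses a hypothetical file fruit.txt (invented for illustration) to show -u, -r, and -o together, and then uses the exit status of -c to test whether the original input was already sorted.

```shell
# Hypothetical sample data containing a duplicate record.
workdir=$(mktemp -d)
printf '%s\n' pear apple pear fig > "$workdir/fruit.txt"

# -u drops repeated records, -r reverses the order, -o writes the result to a file.
LC_ALL=C sort -u -r -o "$workdir/fruit.sorted" "$workdir/fruit.txt"
unique_desc=$(cat "$workdir/fruit.sorted")

# -c only checks the order: exit status 0 means sorted, nonzero means not sorted.
if LC_ALL=C sort -c "$workdir/fruit.txt" 2>/dev/null; then
  input_sorted=yes
else
  input_sorted=no
fi

printf '%s\ninput already sorted: %s\n' "$unique_desc" "$input_sorted"
```

The diagnostic that -c prints for unsorted input goes to standard error, which is why it is redirected here.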

After sorting, we can use uniq to find characteristics of our file! Let's learn about that next.
