03. Finding Unique or Duplicate elements uniq

The uniq command collapses adjacent duplicate lines in its input, which is why it is normally given sorted input. With the right options, it can instead report only the repeated lines or only the lines that occur exactly once.

Use sort first!

The uniq command compares only adjacent lines, so duplicates that are not next to each other in an unsorted list go undetected. Make sure you sort first!

$ sort test.txt | uniq
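
To see why the order matters, here is a minimal sketch using a made-up three-line input (not the names.txt file used in this section). The variable names are only for illustration.

```shell
# Hypothetical input: "Bob" appears twice, but not on adjacent lines.
unsorted=$(printf 'Bob\nJon\nBob\n' | uniq)
sorted=$(printf 'Bob\nJon\nBob\n' | sort | uniq)

echo "$unsorted"   # Bob, Jon, Bob: the second Bob is not adjacent, so it survives
echo "$sorted"     # Bob, Jon: sorting puts the duplicates next to each other
```

Without sort, uniq never sees the two Bob lines side by side, so both are kept.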

Default output

The example file here simply lists the names of a few people. If we sort the file and then call uniq, each name in the file is printed exactly once.

$ cat names.txt
Bob
Chase
Jon
Clara
Bob
Theresa
Jon
Billy
Jonathan
Clara
Jonathan
Bob
Jonny
$ sort names.txt | uniq
Billy
Bob
Chase
Clara
Jon
Jonathan
Jonny
Theresa

Outputting repeated lines

To output only the lines that occur more than once (each printed a single time), pass in the -d option.

$ sort names.txt | uniq -d
Bob
Clara
Jon
Jonathan

Outputting unique lines

To output lines that occur just once, pass in the -u option.

$ sort names.txt | uniq -u
Billy
Chase
Jonny
Theresa

With count

With the -c option, each output line is prefixed with the number of times it occurs.

$ sort names.txt | uniq -c
   1 Billy
   3 Bob
   1 Chase
   2 Clara
   2 Jon
   2 Jonathan
   1 Jonny
   1 Theresa
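
A common follow-up is to rank the lines by their counts. The sketch below is one possible way to do it, assuming the same list of names as above; it keeps the names in a shell variable (an illustrative choice) so that it does not touch any file on disk.

```shell
# Same names as in names.txt, held in a variable for a self-contained example.
names=$(printf '%s\n' Bob Chase Jon Clara Bob Theresa Jon Billy Jonathan Clara Jonathan Bob Jonny)

# Count each name, then sort numerically (-n) in reverse (-r) so the
# most frequent name comes first; head -n 1 keeps only the top line.
top=$(printf '%s\n' "$names" | sort | uniq -c | sort -rn | head -n 1)

echo "$top"   # the count for Bob, who appears three times
```

The width of the leading padding before the count differs between uniq implementations, so scripts should not rely on an exact column position.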

More options

Here is a list of the most-used options. Check out the man page for more.

-c, --count
Prefix each output line with the number of times it occurred.
-d, --repeated
Output only repeated lines, one copy of each.
-f, --skip-fields=n
Skip the first n whitespace-separated fields of each line when comparing.
-i, --ignore-case
Ignore case during the line comparisons.
-s, --skip-chars=n
Skip the leading n characters of each line when comparing.
-u, --unique
Opposite of -d: output only the lines that are not repeated. This is not the default; with no options, uniq prints every distinct line once.
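
The comparison options can be tried out with small made-up inputs, as in this sketch. The sample data and variable names are assumptions for illustration only; in each case uniq keeps the first line of every group it considers equal.

```shell
# -i: treat "Bob", "bob" and "BOB" as the same line.
ignore_case=$(printf 'Bob\nbob\nBOB\nJon\n' | uniq -i)

# -f 1: skip the first field (here a made-up ID) and compare the rest.
skip_field=$(printf '1 Bob\n2 Bob\n3 Jon\n' | uniq -f 1)

# -s 2: skip the first two characters of each line before comparing.
skip_chars=$(printf 'a-Bob\nb-Bob\nc-Jon\n' | uniq -s 2)

echo "$ignore_case"
echo "$skip_field"
echo "$skip_chars"
```

The inputs here are already grouped, so no sort is needed. With real data, sort on the same key that uniq compares, otherwise equal lines may not end up adjacent.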
