01. Introduction to POSIX Regular Expressions

What are regular expressions?

A regular expression is a sequence of characters, some literal and some symbolic, that describes a pattern of text. For example, the regular expression ^hello. matches any line that starts with the text "hello" followed by any single character. In this series we'll go over how to construct such regular expressions.
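To make the ^hello. example concrete, here is a minimal sketch that feeds four made-up lines to grep. The sample lines are assumptions chosen for illustration; only the pattern comes from the text above.

```shell
# Sample input lines are invented for this illustration.
# ^hello. needs "hello" at the start of the line plus one more character.
matches=$(printf '%s\n' 'hello world' 'hello' 'say hello!' 'hellos' | grep '^hello.')

# Prints the lines "hello world" and "hellos".
# "hello" alone has no character after it, and "say hello!" does not start with hello.
printf '%s\n' "$matches"
```

Note that grep prints the whole line that contains a match, not just the matched characters.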

What is the POSIX standard?

POSIX, an acronym for Portable Operating System Interface, is a set of standards defined by the IEEE Computer Society for maintaining compatibility between operating systems. The standard was created to make software portable between variations of Unix and other operating systems.

How did POSIX affect Unix regular expressions?

Regular expression syntax varies with the language and the tool that implements it. On the command line, there used to be three different commands for regular expressions. grep supported basic regular expressions (BRE), while extended grep, egrep, added notations that made it more powerful at the cost of efficiency. The collection of features it supports is known as extended regular expressions (ERE). Additionally, there was fast grep, fgrep, which matched one or more fixed strings. POSIX merged these variations into grep in 1992; egrep and fgrep correspond to grep -E and grep -F.

Now, this merging of basic regular expressions (BRE) and extended regular expressions (ERE) did not mean their notations were combined. When using grep, the default is set to BRE notation, but you can easily switch to ERE with the -E option.
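As a rough sketch of the difference, the snippet below runs the same pattern through grep in both modes. The three sample lines are made up; the + character is used because it is a literal in BRE but means "one or more" in ERE.

```shell
# Sample input is invented for this illustration.
sample=$(printf '%s\n' 'aa' 'a+' 'b')

# BRE (default): + is an ordinary character, so only the line "a+" matches.
bre=$(printf '%s\n' "$sample" | grep 'a+')

# ERE (-E): + means "one or more of the preceding item",
# so every line containing at least one "a" matches.
ere=$(printf '%s\n' "$sample" | grep -E 'a+')

printf 'BRE:\n%s\n' "$bre"
printf 'ERE:\n%s\n' "$ere"
```

The pattern text is identical in both calls; only the -E option changes how grep interprets it.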

Did I confuse you enough yet? Hopefully not... We'll go over which regex notations need the -E option, so no need to be lost!

Regular Expressions with grep

On the command line, regular expressions are most commonly used through the grep command. The name comes from the ed editor command g/re/p, short for "global regular expression print".

Why learn regex and grep?

Knowing grep and how to construct regular expressions is extremely powerful. For example, you can complete text manipulations such as searches and substitutions in a single line. These can be incorporated into shell scripts to automate your workflow for fast and easy text processing. Regular expressions can also be used in text editors such as Vim and Emacs, file viewers such as less and man, and programming languages such as awk, Python, and Perl.

A sample test

Let's try out a simple grep command to get started. We won't be using any special characters, just literal values.

$ ls /usr/bin | grep 'zip'
bunzip2
bzip2
bzip2recover
funzip
gunzip
gzip
unzip
unzipsfx
zip
zipcloak
zipdetails
zipdetails5.16
zipdetails5.18
zipgrep
zipinfo
zipnote
zipsplit

Here we can see every command in our /usr/bin folder whose name contains zip. Remember that the bin folder is where our commands are stored as executable files.

Avoid Parsing ls

This article makes a good case for why you shouldn't parse the results of ls. In this tutorial, we'll be parsing the contents of our /usr/bin folder purely as an example. However, be sure to give the article a quick read and understand that parsing the results of ls may give unexpected results.
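If you want the same kind of listing without parsing ls, one possible approach is to let the shell expand a glob instead. The sketch below is only an illustration: names_containing is a hypothetical helper name, not a standard command, and it matches a literal word rather than a regular expression.

```shell
# names_containing DIR WORD
# Hypothetical helper: prints the names of entries in DIR whose name contains WORD.
# The shell expands the glob itself, so unusual file names are handled safely.
names_containing() {
    dir=$1
    word=$2
    for path in "$dir"/*"$word"*; do
        # If nothing matched, the glob is left unexpanded; skip that case.
        [ -e "$path" ] || continue
        # Strip the leading directory part and print only the file name.
        printf '%s\n' "${path##*/}"
    done
}

# Roughly equivalent to: ls /usr/bin | grep 'zip'
names_containing /usr/bin zip
```

The output depends on what is installed on your system, so no sample output is shown here.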

Options with grep

Here is a list of useful options you can use with the grep command.

-c  Print the number of matching lines.
-E  Use extended regular expressions.
-e pattern  Specify a pattern. Repeat the option to give a list of patterns; lines matching any pattern in the list are returned.
-F  Use fixed strings (ignore special characters).
-f file  Read patterns from a newline-separated file.
-h  Suppress the output of file names.
-i  Ignore case.
-L  Print the names of files that contain no matches.
-l  List the names of files that match the pattern (instead of printing matched lines).
-n  Prefix each matching line with its line number within the file.
-q  Print nothing; exit quietly and report the result through the exit status.
-r  Search recursively through the specified folder.
-s  Suppress error messages.
-v  Print the lines that don't match any of the patterns.
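Here is a short sketch that tries a few of these options (-c, -i, -n, -v and -F) on a small temporary file. The file contents are invented for the example, and the comments describe what each call is expected to return for this input.

```shell
# Build a small sample file; its contents are invented for this illustration.
tmpfile=$(mktemp)
printf '%s\n' 'Zip archive' 'gzip stream' 'plain text' 'unzip tool' > "$tmpfile"

# -c: count matching lines. 'Zip' has a capital Z, so only two lines match.
count=$(grep -c 'zip' "$tmpfile")

# -i: ignore case, so 'Zip archive' is counted as well.
count_ci=$(grep -c -i 'zip' "$tmpfile")

# -n: prefix each matching line with its line number.
numbered=$(grep -n 'zip' "$tmpfile")

# -v: print only the lines that do NOT match.
inverted=$(grep -v -i 'zip' "$tmpfile")

# -F: treat the pattern as a fixed string, so the dot matches only a literal dot.
fixed=$(printf '%s\n' 'a.c' 'abc' | grep -F 'a.c')

printf '%s\n' "$count" "$count_ci" "$numbered" "$inverted" "$fixed"
rm -f "$tmpfile"
```

Options can be combined, as with -c -i above, and most of them work the same way whether grep reads from a file or from a pipe.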

Now that you have an understanding of regular expressions, POSIX standards, and grep, let's learn about the two types of characters in regex.
