A regular expression is a sequence of characters, some literal and some symbolic, that describes a pattern of text. For example, the regular expression ^hello. matches any line that starts with the text "hello" followed by a single character. In this series we'll go over how to construct such regular expressions.
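To make the ^hello. example concrete, here is a minimal sketch that runs it through grep (the command this article covers further down). The sample lines are made up for illustration.

```shell
# Made-up sample input: four lines, only two of which start with
# "hello" followed by at least one more character.
sample='hello!
hello
say hello there
hello world'

# ^ anchors the match to the start of the line; . matches any single character.
matches=$(printf '%s\n' "$sample" | grep '^hello.')
printf '%s\n' "$matches"
```

The bare "hello" line is skipped because nothing follows the word, and "say hello there" is skipped because the line does not start with "hello".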
POSIX, an acronym for Portable Operating System Interface, is a set of standards defined by the IEEE Computer Society for maintaining compatibility between operating systems. The standards were created to make software portable between variants of Unix and other operating systems.
Regular expressions vary between the languages that implement them, as well as between the tools that use them. On the command line, there used to be three different commands for regular expressions. grep supported basic regular expressions (BRE), while extended grep, egrep, added more notation that made it more powerful, at the cost of efficiency. The collection of features it includes is known as extended regular expressions (ERE). Additionally, there was fast grep, fgrep, which allowed matching against multiple fixed strings. These variations were merged into grep by POSIX in 1992.
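After the merge, the behaviour of the older commands is reached through options on grep itself: -E for extended regular expressions and -F for fixed strings. A small sketch of the fixed-string case, using made-up input:

```shell
# Made-up input: one line with a literal dot, one without.
input='a.c
abc'

# As a regular expression, the dot matches any single character,
# so both lines match.
as_regex=$(printf '%s\n' "$input" | grep 'a.c')

# With -F the pattern is a fixed string, so only the literal "a.c" matches.
as_fixed=$(printf '%s\n' "$input" | grep -F 'a.c')

printf 'regex:\n%s\nfixed:\n%s\n' "$as_regex" "$as_fixed"
```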
Now, this merging of basic regular expressions (BRE) and extended regular expressions (ERE) did not mean their notations were combined. When using grep, the default is BRE notation, but you can easily switch to ERE with the -E option.
Did I confuse you enough yet? Hopefully not... We'll go over which regex notations need the -E option, so no need to be lost!
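As a quick taste of the difference, here is a sketch using the alternation character |, which is an ordinary literal in BRE but means "or" in ERE. The input lines are made up for illustration.

```shell
# Made-up input, including one line that contains a literal "|".
input='gzip
tar
zip|tar'

# Default (BRE): | is a literal character, so only the third line matches.
bre=$(printf '%s\n' "$input" | grep 'zip|tar')

# With -E (ERE): | means "zip or tar", so every line matches.
ere=$(printf '%s\n' "$input" | grep -E 'zip|tar')

printf 'BRE:\n%s\nERE:\n%s\n' "$bre" "$ere"
```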
On the command line, regular expressions are used through the grep command, whose name stands for "global regular expression print".
Knowing grep and how to construct regular expressions is extremely powerful. For example, you can carry out text manipulations such as searches and substitutions all in one line. These can be incorporated into shell scripts to automate your workflow for fast and easy text processing. Regular expressions can also be used in text editors such as Vim and Emacs, file viewers such as less and man, and programming languages such as awk, Python, and Perl.
Let's try out a simple grep command to get started. We won't be using any special characters, just literal values.
$ ls /usr/bin | grep 'zip'
bunzip2
bzip2
bzip2recover
funzip
gunzip
gzip
unzip
unzipsfx
zip
zipcloak
zipdetails
zipdetails5.16
zipdetails5.18
zipgrep
zipinfo
zipnote
zipsplit
Here, we can see all the commands in our /usr/bin folder whose names contain zip. Remember that the bin folder is where our commands are stored as binaries.
This article makes a good case for why you shouldn't parse the output of ls. In this tutorial, we'll parse the contents of our /usr/bin folder purely as an example. However, be sure to give the article a quick read and understand that parsing the output of ls may give unexpected results.
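One common way to avoid parsing ls in real scripts is to let the shell match the file names itself with a glob. Note that a glob is not a regular expression, so this is a sketch of an alternative and not a replacement for the grep examples in this tutorial. The scratch directory and file names below are made up so the result doesn't depend on what is installed.

```shell
# Hypothetical scratch directory with three made-up file names.
dir=$(mktemp -d)
touch "$dir/gzip" "$dir/unzip" "$dir/tar"

# Let the shell expand *zip* instead of piping ls into grep.
found=""
for path in "$dir"/*zip*; do
    [ -e "$path" ] || continue      # skip the unexpanded pattern if nothing matched
    found="$found${path##*/} "      # keep only the file name, drop the directory
done

printf '%s\n' "$found"
rm -rf "$dir"
```

Because each name arrives as its own loop variable, unusual characters in file names such as spaces or newlines can't be mistaken for separators.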
Here is a list of useful options you can use with the grep command.
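A few widely available options can be tried on throwaway input. The four shown here (-i, -v, -n, -c) are picked for illustration only, and the sample text is made up.

```shell
# Made-up input with mixed case.
text='Zip
gzip
tar'

# -i ignores case, so both "Zip" and "gzip" match.
ignore_case=$(printf '%s\n' "$text" | grep -i 'zip')

# -v inverts the match: print the lines that do NOT contain "zip".
inverted=$(printf '%s\n' "$text" | grep -v 'zip')

# -n prefixes each matching line with its line number.
numbered=$(printf '%s\n' "$text" | grep -n 'zip')

# -c prints the number of matching lines instead of the lines themselves.
count=$(printf '%s\n' "$text" | grep -c 'zip')

printf '%s\n---\n%s\n---\n%s\n---\n%s\n' "$ignore_case" "$inverted" "$numbered" "$count"
```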
Now that you have an understanding of regular expressions, POSIX standards, and grep, let's learn about the two types of characters in regex.