06. Backreferences and Anchors

Backreferences

When constructing a regular expression, you may need to refer back to text that an earlier part of the regex has already matched. Backreferences let you do this.

Enclose the part of the regex that you want to reference in escaped parentheses, \(...\). To refer back to it, use \digit, where \1 stands for the text matched by the first parenthesized group, \2 for the second, and so on up to \9.

For example, the regex \(foo\)ber\(buzz\)*\2\1 behaves like (foo)ber(buzz)*buzzfoo: \2 must repeat the text captured by the second group (buzz), and \1 the text captured by the first (foo). Since the asterisk allows zero or more repetitions of the preceding group, this matches text such as fooberbuzzbuzzfoo, fooberbuzzbuzzbuzzfoo, and so on.
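As a minimal sketch of two backreferences in use, the snippet below feeds two made-up sample lines to grep. The sample text and the variable names are illustrative assumptions, not part of the original tutorial. Each group captures a run of lowercase letters, and \2 and \1 then require the same text again in reverse order.

```shell
# Two hypothetical sample lines: only the first repeats its segments mirrored.
sample='foo-bar-bar-foo
foo-bar-foo-bar'

# \([a-z]*\) appears twice: group 1 and group 2.
# \2 must repeat whatever group 2 matched, \1 whatever group 1 matched.
backref_matches=$(printf '%s\n' "$sample" | grep '\([a-z]*\)-\([a-z]*\)-\2-\1')

printf '%s\n' "$backref_matches"
# prints: foo-bar-bar-foo
```

The second line is rejected because the segment after the second hyphen (foo) differs from what group 2 captured (bar).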

Begins with and Ends with Anchors

So far we have used regex to search anywhere within a line, with no restriction on position. For example, the pattern zip finds every entry that contains the text 'zip' anywhere in it. This is useful, but what if we want only the strings that start or end with specific text? For that, we use the anchors ^ (start of line) and $ (end of line).

# files that start with zip
$ ls /usr/bin | grep '^zip'
zip zipcloak zipdetails zipdetails5.16 zipdetails5.18 zipgrep zipinfo zipnote zipsplit
# files that end with zip
$ ls /usr/bin | grep 'zip$'
funzip gunzip gzip unzip zip
# files that start and end with zip
$ ls /usr/bin | grep '^zip$'
zip

Matching empty lines

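The pattern ^$ matches empty lines: the start of the line is immediately followed by its end. It can be used as grep -v '^$' to filter out all empty lines. Quote the pattern so that the shell passes it to grep unchanged.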
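A short sketch of this filter on made-up input (the text and variable names are assumptions for illustration): grep -v inverts the match, so only the lines that are not empty are kept.

```shell
# Hypothetical input with blank lines between the words.
text='first

second

third'

# ^$ matches lines with nothing between start and end; -v keeps the rest.
non_empty=$(printf '%s\n' "$text" | grep -v '^$')

printf '%s\n' "$non_empty"
```

Note that a line containing only spaces is not empty in this sense, so ^$ would not match it.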

Using ^ within text

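In BRE, ^ acts as an anchor only at the beginning of the regex, and $ only at the end. Anywhere else they have no special effect and are treated as literal characters, so a^b matches the three characters a, ^, b. The same holds for a ^ inside brackets when it is not the first character: [ab^cd] means one of the characters a, b, ^, c, or d.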
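The following sketch illustrates the literal caret, assuming GNU grep in its default BRE mode. The sample lines and variable names are invented for this example.

```shell
# Hypothetical sample lines containing carets in different positions.
sample='a^b
ab
^ab'

# In the middle of a BRE the caret is an ordinary character,
# so only the line containing the literal text a^b is selected.
literal_caret=$(printf '%s\n' "$sample" | grep 'a^b')

# Inside brackets, but not first, the caret is just another member of the set.
# -c counts the lines containing x, ^, or y.
bracket_count=$(printf '%s\n' "$sample" | grep -c '[x^y]')

printf '%s\n' "$literal_caret"   # prints: a^b
printf '%s\n' "$bracket_count"   # prints: 2
```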

The two uses of the caret ^

At the start of a regex, ^ anchors the match to the beginning of the line. However, remember that when it appears as the first character inside brackets, its meaning changes to negation.

Thus, [^abcd] is different from ^[abcd]. The first matches any single character other than a, b, c, or d. The second matches lines that start with a, b, c, or d.
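To make the difference concrete, here is a small sketch that runs both patterns over the same made-up word list (the words and variable names are assumptions for illustration).

```shell
# Hypothetical word list.
words='apple
zebra
abba'

# ^[abcd]: the line must begin with a, b, c, or d.
starts_with=$(printf '%s\n' "$words" | grep '^[abcd]')

# [^abcd]: the line must contain at least one character outside a, b, c, d.
has_other=$(printf '%s\n' "$words" | grep '[^abcd]')

printf '%s\n' "$starts_with"   # apple and abba
printf '%s\n' "$has_other"     # apple and zebra
```

The word abba is selected by the first pattern only, because it starts with a but contains no character outside the set.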
