06. Backreferences and Anchors

Backreferences

When constructing a regular expression, you may need to refer back to text that an earlier part of the expression has already matched. To do so, we use backreferences.

The phrases that you want to reference must be enclosed in escaped parentheses (\(...\)). To reference them, use \digit, where \1 represents the first parenthesized phrase, \2 the second, and so on.

For example, a regex such as \(foo\)ber\(buzz\)*\2\1 simplifies to (foo)ber(buzz)*buzzfoo. Since the asterisk allows zero or more repetitions of the preceding group, and \2 repeats whatever that group last matched, this would match text such as fooberbuzzbuzzfoo, fooberbuzzbuzzbuzzfoo, and so on.
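The backreference pattern above can be tried out with grep. This is a minimal sketch: the sample strings and the variable name matches are made up for illustration, and -x is added so that the whole line has to match the pattern.

```shell
# Feed four candidate strings to grep, one per line.
# -x requires the pattern to match the entire line.
matches=$(printf '%s\n' \
    'fooberbuzzbuzzfoo' \
    'fooberbuzzbuzzbuzzfoo' \
    'fooberbuzzfoo' \
    'fooberbuzzbuzzbar' \
  | grep -x '\(foo\)ber\(buzz\)*\2\1')

# Only the lines where \2 repeats "buzz" and \1 repeats "foo" survive.
printf '%s\n' "$matches"
```

The third string fails because there is no second buzz for \2 to match, and the fourth fails because it ends in bar rather than the text captured by \1.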

Begins with and Ends with Anchors

So far we have learned how to use regex to search within lines, with no restrictions on position. For example, the pattern zip finds every line that contains the text 'zip' anywhere within it. This is great, but what if we want to find only the lines that start or end with specific text? To do precisely this, we use the ^ and $ symbols: ^ anchors the match to the start of the line, and $ anchors it to the end.

# files that start with zip
$ ls /usr/bin/ | grep '^zip'
zip zipcloak zipdetails zipdetails5.16 zipdetails5.18 zipgrep zipinfo zipnote zipsplit
# files that end with zip
$ ls /usr/bin | grep 'zip$'
funzip gunzip gzip unzip zip
# files that start and end with zip
$ ls /usr/bin | grep '^zip$'
zip

Matching empty lines

The pattern ^$ matches empty lines: the start of the line immediately followed by its end. It may be used as grep -v '^$' to filter out all empty lines.
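As a small sketch of this, the commands below run ^$ against some made-up input that contains two empty lines. The variable names non_empty and empty_count are only for illustration.

```shell
# -v inverts the match, so only lines that are NOT empty are kept.
non_empty=$(printf 'alpha\n\nbeta\n\ngamma\n' | grep -v '^$')
printf '%s\n' "$non_empty"

# -c counts matching lines instead of printing them;
# here it counts the empty lines in the same input.
empty_count=$(printf 'alpha\n\nbeta\n\ngamma\n' | grep -c '^$')
printf 'empty lines: %s\n' "$empty_count"
```

Note that ^$ only matches lines with no characters at all. A line that contains spaces or tabs is not empty in this sense and would be kept by grep -v '^$'.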

Using ^ within text

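In BRE, when ^ or $ appears anywhere other than the beginning or the end of the pattern, respectively, it loses its special meaning and is treated as a literal character. For example, a^b matches the literal text a^b. The same holds inside brackets when ^ is not the first character: [ab^cd] means any one of the characters a, b, ^, c, or d.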
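The following sketch illustrates the literal behaviour with grep in its default BRE mode. The input strings and variable names are invented for the example.

```shell
# ^ in the middle of a BRE is a literal caret, not an anchor.
caret_hits=$(printf '%s\n' 'a^b' 'ab' '^ab' | grep 'a^b')
printf '%s\n' "$caret_hits"

# Likewise, $ that is not at the end of the pattern is a literal dollar sign.
dollar_hits=$(printf '%s\n' 'cost$5' 'cost5' | grep 't$5')
printf '%s\n' "$dollar_hits"

# Inside brackets, a ^ that is not the first character is just another member.
bracket_hits=$(printf '%s\n' 'x' '^' 'c' | grep '[ab^cd]')
printf '%s\n' "$bracket_hits"
```

The patterns are single-quoted so that the shell passes $ through to grep untouched.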

The two uses of the caret ^

At the beginning of a regex, ^ anchors the match to the start of the line. However, remember that if it appears as the first character inside brackets, its meaning changes to negation.

Thus, [^abcd] is different from ^[abcd]. The first matches any single character other than a, b, c, or d. The second matches text that starts with a, b, c, or d.
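To see the difference side by side, here is a short sketch that runs both patterns, plus their combination, over the same four made-up words. The variable names are hypothetical.

```shell
# ^[abcd]: the line must START with a, b, c, or d.
starts_with=$(printf '%s\n' apple zebra bad dab | grep '^[abcd]')
printf 'starts with a-d: %s\n' $starts_with

# [^abcd]: the line must CONTAIN at least one character other than a, b, c, d.
contains_other=$(printf '%s\n' apple zebra bad dab | grep '[^abcd]')
printf 'has a non a-d character: %s\n' $contains_other

# ^[^abcd]: both uses together; the line must start with a character
# that is not a, b, c, or d.
starts_other=$(printf '%s\n' apple zebra bad dab | grep '^[^abcd]')
printf 'starts with a non a-d character: %s\n' $starts_other
```

The words bad and dab consist only of the letters a, b, and d, so [^abcd] finds nothing to match in them, while apple matches both of the first two patterns.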
