Brackets let you match a single character from a set. For example, to match any single vowel, use [aeiou].
$ ls /usr/bin | grep 'b[aeiou]t'
batch
bitesize.d
smbutil
To match any character that is not in the set, put a caret (^) immediately after the opening bracket.
$ ls /usr/bin | grep 'b[^aeiou]t'
rwbytype.d
This pattern matches any text in which exactly one character that is not a, e, i, o, or u sits between b and t.
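The two patterns can be compared side by side without depending on what happens to be in /usr/bin. The following is a minimal sketch that assumes a POSIX shell and grep; the word list is made up for illustration.

```shell
# Hypothetical sample words, one per line (not taken from /usr/bin).
words='bat
bit
byte
boot
robot'

# Lines with exactly one vowel between b and t.
vowel_matches=$(printf '%s\n' "$words" | grep 'b[aeiou]t')

# Lines with exactly one non-vowel character between b and t.
non_vowel_matches=$(printf '%s\n' "$words" | grep 'b[^aeiou]t')

printf 'vowel:\n%s\n' "$vowel_matches"
printf 'non-vowel:\n%s\n' "$non_vowel_matches"
```

Note that boot matches neither pattern, because two characters sit between its b and t and a bracket expression stands for exactly one.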
Inside brackets, a hyphen (-) between two characters specifies a range of letters or digits, such as [a-d] or [0-9].
$ ls /usr/bin | grep '[a-d][e-g][h-l]'
afhash
afida
afinfo
cancel
git-receive-pack
ldapdelete
mdfind
snmpdelta
This command selects names that contain three consecutive characters: the first from a, b, c, or d, the second from e, f, or g, and the third from h, i, j, k, or l. Notice that this three-letter sequence can appear anywhere in the name.
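The same range pattern can be tried on a small, fixed input. This sketch uses a hypothetical list of names and sets LC_ALL=C for the grep call so that the ranges follow plain ASCII order.

```shell
# Hypothetical sample names; only some contain the three-character sequence.
range_matches=$(printf '%s\n' afinfo cancel grep mdfind zip \
  | LC_ALL=C grep '[a-d][e-g][h-l]')

# afinfo matches on "afi", cancel on "cel", mdfind on "dfi".
printf '%s\n' "$range_matches"
```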
A serious downside of ranges is that they are not portable: which characters a range such as [a-d] covers depends on the collation order of the current locale. To explain this, we need to learn a bit of history.
Unix was first developed with just ASCII characters. ASCII assigns the codes 0 to 127 to control codes, digits, punctuation marks, and the uppercase and lowercase English letters. In ASCII, the letters are ordered like this:
ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz
As other countries began adopting Unix, they had to make room for more characters, such as an e with an accent (é) or a c with a cedilla (ç). Thus, some collations arose with a dictionary-style ordering like this:
aAbBcCdDeEfFgGhHiIjJkKlLmMnNoOpPqQrRsStTuUvVwWxXyYzZ
You can probably see the problem already. An expression such as [A-Z] matches all uppercase letters under the first ordering, but every letter except lowercase a under the second.
For this reason, avoid ranges in scripts that must be portable. Use the POSIX character classes described below instead.
To check your current locale, print the $LANG variable. If LC_ALL or LC_COLLATE is set, it takes precedence over LANG for collation.
$ echo $LANG
en_US.UTF-8
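If a script has to use a range anyway, one common workaround is to force the C locale for that single command, which gives the ASCII ordering regardless of LANG. The sketch below assumes a POSIX shell and grep; the sample letters are illustrative only.

```shell
# Show the collation-related settings the shell currently sees.
printf 'LANG=%s LC_ALL=%s LC_COLLATE=%s\n' "${LANG:-}" "${LC_ALL:-}" "${LC_COLLATE:-}"

# Under the C locale, [A-Z] follows ASCII order and matches only uppercase letters.
upper_only=$(printf '%s\n' a B c D | LC_ALL=C grep '[A-Z]')
printf '%s\n' "$upper_only"
```

Setting LC_ALL on the grep invocation alone keeps the override local to that command, so the rest of the script still runs in the user's locale.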
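Because of the discrepancies in collation ordering, POSIX defines several character classes that make shell scripts more portable. Here is a list of the character classes:
[:alnum:] letters and digits
[:alpha:] letters
[:blank:] space and tab
[:cntrl:] control characters
[:digit:] digits 0 through 9
[:graph:] visible characters (printable characters other than space)
[:lower:] lowercase letters
[:print:] printable characters, including space
[:punct:] punctuation characters
[:space:] whitespace characters such as space, tab, and newline
[:upper:] uppercase letters
[:xdigit:] hexadecimal digits (0-9, A-F, a-f)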
A character class must itself appear inside a bracket expression, so the digit class is written [[:digit:]], not [:digit:].
$ ls /usr/bin/ | grep '[[:digit:]][[:alpha:]][[:digit:]]'
a2p5.16
a2p5.18
s2p5.16
s2p5.18
This matched every file name that contains a digit, followed by an alphabetic character, followed by another digit.
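Here is the same pattern run against a small, made-up list of names, so the result does not depend on the contents of /usr/bin. This is a sketch assuming a POSIX shell and grep.

```shell
# Hypothetical sample names, one per line.
names='a2p5.16
grep
s2p
x11
7z7'

# A digit, then a letter, then a digit, anywhere in the name.
class_matches=$(printf '%s\n' "$names" | grep '[[:digit:]][[:alpha:]][[:digit:]]')

# a2p5.16 matches on "2p5" and 7z7 matches as a whole.
printf '%s\n' "$class_matches"
```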
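Most metacharacters lose their special meaning when placed within brackets. The exceptions are the caret (^) directly after the opening bracket, the hyphen (-) between two characters, and the closing bracket (]).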
The following code would match any listings with a minus symbol (-), a period (.) or the letter x.
$ ls /usr/bin/ | grep '[-.x]'
... weblatency.d wish8.4 wish8.5 xar xargs xattr xattr-2.6 xattr-2.7 xcode-select ...
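To see the difference between a metacharacter inside and outside brackets, the following sketch compares an unbracketed period with a bracketed one. It assumes a POSIX shell and grep, and the three sample strings are illustrative.

```shell
samples='a.b
axb
a*b'

# Outside brackets, "." matches any single character, so all three lines match.
any_char=$(printf '%s\n' "$samples" | grep 'a.b')

# Inside brackets, "." and "*" are literal, so only a.b and a*b match.
literal_only=$(printf '%s\n' "$samples" | grep 'a[.*]b')

printf 'unbracketed:\n%s\n' "$any_char"
printf 'bracketed:\n%s\n' "$literal_only"
```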
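To match a literal closing bracket (]), place it first in the list (directly after the ^ if the list is negated). To match a literal minus symbol (-), place it first or last in the list. If you need both, put the bracket first and the minus symbol last.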
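As a short sketch of that placement rule, the bracket expression below lists a literal ] first and a literal - last. It assumes a POSIX shell and grep, and the sample strings are made up.

```shell
samples='a-b
a]b
a.b
axb'

# "[]-]" is a bracket expression containing just "]" and "-".
literal_matches=$(printf '%s\n' "$samples" | grep 'a[]-]b')

printf '%s\n' "$literal_matches"
```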
In some languages, two letters in sequence are treated as a single unit for sorting purposes, known as a collating element.
For example, if the current locale defines 'ts' as a collating element, we can match it as one unit by writing it between [. and .] inside a bracket expression: [[.ts.]].
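Furthermore, we can match characters that are variations of a base letter, such as those carrying an accent mark or tilde. The expression [=a=], placed inside a bracket expression as [[=a=]], matches all variations of the letter a that the current locale treats as equivalent. Depending on the locale, this includes à, á, â, and ã.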