As a system administrator, knowing how to use grep with regular expressions (regex) is crucial to your daily tasks. The grep command helps you know your systems like the back of your hand and makes troubleshooting a breeze.
The grep command is one of the most useful commands for searching through text files for specific patterns. In this tutorial, you’ll learn many ways to use it.
Read on and never be lost for information on your system again!
Prerequisites
This tutorial will be a hands-on demonstration. If you’d like to follow along, be sure you have the following.
- A Linux machine – This demo uses Ubuntu 20.04, but any Linux distribution will work.
- A non-root user account with sudo privileges.
Finding Exact Specified Strings
Manually finding a string in a configuration file is possible, but telling one string from another is hard on the eyes when everything is the same text color. Luckily, the grep command lets you highlight the strings you’re looking for.
1. Download the GNU General Public License to your working directory and save it as GPL-3. This tutorial uses it as a stand-in for a configuration file.
2. Run the grep command below to search through the downloaded file and highlight (--color=always) the string GNU. You pass grep the string you want to search for as an argument, along with the filename (GPL-3).
sudo grep --color=always "GNU" GPL-3
The grep command prints every line in the GPL-3 file that contains the string “GNU”, with each match highlighted in red. This behavior is helpful when you’re troubleshooting a problem and need to see where a specific word is being used.
Perhaps you want to search for a string while ignoring the letter case. If so, add the -i flag to match the string whether it’s in lower or upper case, like this: sudo grep --color=always -i "GNU" GPL-3
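To see the difference between the two searches without the full license text, here is a minimal sketch. It assumes a small throwaway file whose three lines are invented for illustration (they are not taken from GPL-3), and it uses the -c flag to count matching lines instead of printing them.

```shell
# Hypothetical sample file; its contents are made up for this sketch
sample=$(mktemp)
printf '%s\n' \
  'GNU GENERAL PUBLIC LICENSE' \
  'The gnu project develops free software.' \
  'This line mentions neither.' > "$sample"

# Case-sensitive search: only the line with uppercase "GNU" matches
exact=$(grep -c "GNU" "$sample")

# -i ignores case, so the lowercase "gnu" line matches as well
any_case=$(grep -c -i "GNU" "$sample")

echo "case-sensitive matching lines:   $exact"
echo "case-insensitive matching lines: $any_case"

rm -f "$sample"
```

Counting with -c is a quick way to check how much a flag widens or narrows a search before you read through the full output.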
Excluding Lines That Contain a Specific String
So far, you’ve learned how to find lines that contain a specific exact string, with or without the case being taken into account. But what if you only want to see the lines that don’t contain an exact string? Adding the -v or --invert-match flag to the grep command lets you do just that.
Run the grep command below to search through the GPL-3 file for the string the. This command prints all of the lines that don’t contain (-v) the string the.
sudo grep -v "the" GPL-3
Since the -i flag was not set, the search is case-sensitive, so grep still prints lines that contain the string in a different letter case, such as “The” or “THE”. This behavior is useful when you want to exclude certain lines from your grep search.
You can also use the -n or --line-number flag to print the line number of each matching line, like this: sudo grep -vn -i "the" GPL-3. The -n flag comes in handy when working with source code files. You can use the line number to quickly jump to the line you’re looking for in your source code file.
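The sketch below shows how -v, -i, and -n combine. It assumes a three-line sample file invented for illustration, so the output is short enough to check by eye.

```shell
# Hypothetical sample file; its contents are made up for this sketch
sample=$(mktemp)
printf '%s\n' \
  'The license applies here.' \
  'Copy the program freely.' \
  'You may copy it.' > "$sample"

# -v alone is case-sensitive: the line starting with "The" is still printed
inverted=$(grep -v "the" "$sample")

# Adding -i also excludes "The"; -n prefixes each printed line with its number
numbered=$(grep -vn -i "the" "$sample")

printf 'grep -v:\n%s\n\n' "$inverted"
printf 'grep -vn -i:\n%s\n' "$numbered"

rm -f "$sample"
```

The second command prints only 3:You may copy it. because the first two lines each contain the string in some letter case.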
Using Regex Anchors to Match the Beginning and End of a String
In the previous examples, you’ve learned to use grep to search for lines that contain a specific string. In those examples, the regex engine matched characters. Anchors, on the other hand, do not match any character at all. Instead, they match a position before or after characters.
The caret symbol (^) matches the position before the first character in a line, while the dollar symbol ($) matches the position after the last character in a line.
Anchors are helpful when your input consists of multiple lines separated by line breaks and you want to work with individual lines rather than the entire text. Because grep processes its input one line at a time, ^ and $ match the start and end of each line.
Run the grep command below to search for lines that start with GNU.
sudo grep "^GNU" GPL-3
Similarly, you can run the grep command below to search for lines that end with GNU.
sudo grep "GNU$" GPL-3
The output is blank since no line in the file ends with GNU.
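Here is a minimal sketch of both anchors on a sample file invented for illustration, where every line contains GNU but in a different position. The patterns are wrapped in single quotes, which is a choice made for this sketch so that the shell never tries to interpret the $ character.

```shell
# Hypothetical sample file; its contents are made up for this sketch
sample=$(mktemp)
printf '%s\n' \
  'GNU General Public License' \
  'a copy of the GNU license' \
  'released by GNU' > "$sample"

# ^ anchors the match to the start of the line
starts=$(grep '^GNU' "$sample")

# $ anchors the match to the end of the line
ends=$(grep 'GNU$' "$sample")

echo "starts with GNU: $starts"
echo "ends with GNU:   $ends"

rm -f "$sample"
```

The middle line is printed by neither command, because its GNU sits in the middle of the line.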
Searching Strings by Matching Any Character
Instead of matching exact strings, perhaps you’re looking for a word but don’t remember its exact spelling. If so, you can use the period (.) metacharacter in a regular expression to match any single character.
Periods are helpful when you want to match several variations of a string at once. For example, the regular expression “c.t” matches the strings “cat”, “cot”, “cut”, and so on.
Run the grep command below to see how the . symbol works in practice. This command searches through the GPL-3 file for lines that contain any two characters followed by the string cept. Each of the two adjacent dots in the regular expression matches any single character at that position.
sudo grep --color=always "..cept" GPL-3
Notice that the grep command produces many results, as the string cept, preceded by at least two other characters, appears many times in the file in words like unacceptable, exceptions, acceptance, and so on.
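To see exactly what the two dots consume, the sketch below runs the same pattern against a short word list invented for illustration. The -o flag, supported by GNU grep, prints only the matched part of each line instead of the whole line.

```shell
# Hypothetical word list; its contents are made up for this sketch
sample=$(mktemp)
printf '%s\n' 'accept' 'except' 'concept' 'cept' 'xcept' > "$sample"

# Each dot needs one character, so "cept" and "xcept" are too short to match
matched_lines=$(grep '..cept' "$sample")

# -o shows only the text that the pattern actually matched
matched_text=$(grep -o '..cept' "$sample")

printf 'matching lines:\n%s\n\n' "$matched_lines"
printf 'matched text:\n%s\n' "$matched_text"

rm -f "$sample"
```

For concept, the matched text is oncept: the two dots take the two characters immediately before cept, not the start of the word.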
Searching Strings by Matching Sets of Characters
Like searching strings by matching any character, you can use the bracket ([]) metacharacter in a regular expression to match any one character from a set you specify. For example, the regular expression [abc] matches a single character that is “a”, “b”, or “c”.
Run the grep command below to search through the GPL-3 file for any lines that contain the string “too” or “two.”
sudo grep --color=always "t[wo]o" GPL-3
Additionally, you can use the range operator to specify a range of elements instead of listing out every individual element. So the regular expression [a-c] matches any lowercase letter from “a” to “c,” while the regular expression [A-Z] matches any uppercase letter from “A” to “Z.”
Execute the grep command below to search for every line that begins with a capital letter (^[A-Z]) in the GPL-3 file.
sudo grep --color=always "^[A-Z]" GPL-3
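The sketch below tries a character set and a range on a sample file invented for illustration. Setting LC_ALL=C for the range search is an assumption added here, not part of the tutorial's commands: how a range like [A-Z] is interpreted can depend on your locale, and the C locale keeps it to plain uppercase ASCII letters.

```shell
# Hypothetical sample file; its contents are made up for this sketch
sample=$(mktemp)
printf '%s\n' 'too many' 'two copies' 'ten items' 'Top of file' > "$sample"

# [wo] matches exactly one character, either "w" or "o"
set_matches=$(grep 't[wo]o' "$sample")

# ^[A-Z] matches lines whose first character is an uppercase letter
range_matches=$(LC_ALL=C grep '^[A-Z]' "$sample")

printf 't[wo]o:\n%s\n\n' "$set_matches"
printf '^[A-Z]:\n%s\n' "$range_matches"

rm -f "$sample"
```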
Searching Strings by Predefined Classes
Apart from the metacharacters, the grep command also supports a number of predefined (POSIX) character classes that you can use in your regular expressions. Using these predefined classes whenever possible is recommended, as they tend to behave more consistently across grep implementations and locales than hand-written ranges.
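The most common classes are listed below. Each one must appear inside a bracket expression, for example [[:digit:]].
- [:alnum:] – Letters and digits
- [:alpha:] – Letters
- [:digit:] – Digits 0 through 9
- [:lower:] – Lowercase letters
- [:upper:] – Uppercase letters
- [:space:] – Whitespace characters such as spaces, tabs, and newlines
- [:punct:] – Punctuation characters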
Run the grep command below to search for every line that begins with a capital letter in the GPL-3 file. This command is similar to matching sets of characters, but instead of a range, you use the predefined character class [:upper:] inside a bracket expression.
sudo grep --color=always "^[[:upper:]]" GPL-3
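As a small sketch of how these classes are written in practice, the example below uses [:upper:] and [:digit:] on a sample file invented for illustration. Note the doubled brackets: the outer pair is the bracket expression and the inner pair belongs to the class name.

```shell
# Hypothetical sample file; its contents are made up for this sketch
sample=$(mktemp)
printf '%s\n' 'Version 3' 'section 12' 'no numbers here' > "$sample"

# Lines whose first character is an uppercase letter
upper_start=$(grep '^[[:upper:]]' "$sample")

# Lines that contain at least one digit anywhere
with_digit=$(grep '[[:digit:]]' "$sample")

printf 'starts with uppercase:\n%s\n\n' "$upper_start"
printf 'contains a digit:\n%s\n' "$with_digit"

rm -f "$sample"
```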
Escaping Metacharacters
You’ve just searched strings with the help of metacharacters, which you’ll often encounter when using grep commands. Each metacharacter has a special meaning in regular expressions. So you must escape metacharacters with backslashes (\) to tell grep not to use their special meaning.
The most common metacharacters are periods (.) and brackets ([]).
Suppose you plan to look for an IP address (with lots of dots in it) and nothing else. Without escaping the dots, grep treats each one as a metacharacter and also prints lines that contain similar strings that are not IP addresses.
Run the grep command below to find the IP address 192.168.1.18, and see how grep interprets the dots’ special meaning.
sudo grep "192.168.1.18" GPL-3
In this example, grep printed two lines from the GPL-3 file: one containing the string 192.168.1.18 and one containing a string that is not an IP address (192a168b1c18).
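Now, execute the command below to tell grep to interpret the dots as literal characters. Inside double quotes, the shell reduces each \\ to a single backslash, so grep receives \. for every dot.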
sudo grep --color=always "192\\.168\\.1\\.18" GPL-3
You can see that the grep command now prints only the line with the 192.168.1.18 IP address in it.
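The sketch below reproduces the comparison on a two-line sample file invented for illustration. It writes the pattern in single quotes, where one backslash per dot is enough, and it also shows the -F flag as an alternative that is not covered above: -F makes grep treat the whole pattern as a fixed string, so nothing needs escaping.

```shell
# Hypothetical sample file; its contents are made up for this sketch
sample=$(mktemp)
printf '%s\n' \
  'host 192.168.1.18 is up' \
  'token 192a168b1c18 seen' > "$sample"

# Unescaped dots match any character, so both lines count as matches
unescaped=$(grep -c '192.168.1.18' "$sample")

# Escaped dots match only a literal period
escaped=$(grep -c '192\.168\.1\.18' "$sample")

# -F disables regex interpretation entirely
fixed=$(grep -c -F '192.168.1.18' "$sample")

echo "unescaped: $unescaped matching lines"
echo "escaped:   $escaped matching lines"
echo "fixed:     $fixed matching lines"

rm -f "$sample"
```

When the search term is a plain string with no pattern in it, -F is often the simpler choice; escaping is still needed when a literal dot sits inside a larger regular expression.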
Conclusion
In this article, you have learned how to use grep with regex to find strings in files using different patterns. You’ve also learned how to escape metacharacters used in regular expressions, which helps you avoid unexpected matches when writing a script. And you can now confidently use grep to find text patterns in files with great precision.
With this power at your fingertips, why not dig deeper to find the information you need and analyze your data?