Master Grep with Regex and Make Troubleshooting a Breeze

Published: 22 March 2022 - 6 min. read

Nicholas Xuan Nguyen

As a system administrator, knowing how to use grep with regular expressions (regex) is crucial to your daily tasks. The grep command helps you know your systems like the back of your hand and makes troubleshooting a breeze.

The grep command is one of the most useful commands for searching text files for specific patterns. In this tutorial, you’ll learn several ways to use grep with regular expressions.

Read on and never be lost for information on your system again!

Prerequisites

This tutorial will be a hands-on demonstration. If you’d like to follow along, be sure you have the following.

  • A Linux machine – This demo uses Ubuntu 20.04, but any Linux distribution will work.

Finding Exact Specified Strings

Finding a string in a configuration file by eye is possible, but it is tiring when every string has the same text color. Luckily, the grep command can highlight the strings you’re looking for.

1. Download the text of the GNU General Public License version 3 to your working directory and save it as GPL-3. This tutorial uses it as the example file.

2. Run the grep command below to search through the downloaded file and highlight (--color=always) the string GNU. You pass grep the string you want to search for as an argument, followed by the filename (GPL-3).

sudo grep --color=always "GNU" GPL-3

As you can see below, the grep command prints every line in the GPL-3 file that contains the string “GNU”, with each match highlighted in red. This behavior is helpful when you’re troubleshooting a problem and need to see where a specific word is being used.

Finding Exact Specified Word

Perhaps you want to search for a string while ignoring the letter case. If so, add the -i flag to match the string whether it’s in lower or upper case, like this: sudo grep --color=always -i "GNU" GPL-3.

Finding Exact Specific Word Ignoring the Letter Case
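The difference between the two searches can be sketched on a small throwaway file. The sample lines and variable names below are assumptions made for illustration and are not part of the GPL-3 file; the color flag is left out so the captured output stays plain text.

```shell
# Build a small sample file so the example does not depend on GPL-3.
sample="$(mktemp)"
printf '%s\n' \
  'GNU GENERAL PUBLIC LICENSE' \
  'The gnu project develops free software.' \
  'This line mentions neither.' > "$sample"

# Case-sensitive search: only the line with uppercase "GNU" matches.
exact_matches="$(grep "GNU" "$sample")"

# Case-insensitive search (-i): "GNU" and "gnu" both match.
any_case_matches="$(grep -i "GNU" "$sample")"

echo "Case-sensitive:"
echo "$exact_matches"
echo "Case-insensitive:"
echo "$any_case_matches"

rm -f "$sample"
```

The first search prints one line and the second prints two, since -i also accepts the lowercase spelling.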

Excluding Lines That Match a Specific String

So far, you’ve learned how to find lines that contain a specific exact string, with or without the case being taken into account. But what if you only want to see the lines that don’t contain an exact string? Adding the -v or --invert-match flag to the grep command lets you do just that.

Run the grep command below to search through the GPL-3 file for the string the. Because of the -v flag, this command prints only the lines that don’t contain the string the.

sudo grep -v "the" GPL-3

Since the -i flag was not set, the match is case-sensitive, so grep still prints lines that contain the string in a different letter case (such as “The”), as shown below. This behavior is useful when you want to exclude certain lines from your grep search.

Excluding Certain Lines with Exact Word

You can also use the -n or --line-number flag to print the line number of each printed line, like this: sudo grep -vn -i "the" GPL-3. The -n flag comes in handy when working with source code files. You can reference the line number to quickly jump to the line in your source code file that you’re looking for.

Printing Out the Strings with Line Numbers
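As a rough sketch of how -v, -i, and -n combine, the snippet below runs both variants against a three-line sample file. The file contents and variable names are made up for this illustration.

```shell
# Three sample lines: two contain "the" in some letter case, one does not.
sample="$(mktemp)"
printf '%s\n' \
  'The license applies to the program.' \
  'You may copy it.' \
  'THE END' > "$sample"

# -v inverts the match. Without -i, only lowercase "the" counts as a match,
# so the "THE END" line is still printed.
without_the="$(grep -v "the" "$sample")"

# Adding -i excludes every letter case, and -n prefixes each printed line
# with its line number in the file.
without_the_any_case="$(grep -vin "the" "$sample")"

echo "$without_the"           # You may copy it. / THE END
echo "$without_the_any_case"  # 2:You may copy it.

rm -f "$sample"
```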

Using Regex Anchors to Match the Beginning and End of a String

In the previous examples, you used grep to search for lines that contain a specific string. In those examples, the regex engine matched characters. Anchors, on the other hand, do not match any character at all. Instead, they match a position before or after characters.

The caret symbol (^) matches the position before the first character in a string, while the dollar symbol ($) matches the position after the last character in a string.

Anchors are helpful because grep processes its input line by line, so ^ and $ refer to the start and end of each line rather than of the entire file.

Run the grep command below to search for lines that start with GNU.

sudo grep "^GNU" GPL-3
Searching for Lines That Start with GNU

Similarly, you can run the grep command below to search for lines that end with GNU.

sudo grep "GNU$" GPL-3

Below, the output is blank since no line matches the given regex.

Searching for Lines of Strings that End with GNU
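To see both anchors produce a match, here is a minimal sketch on a sample file in which one line starts with GNU and another ends with it. The sample lines and variable names are assumptions for illustration only.

```shell
# "GNU" appears at the start, in the middle, and at the end of a line.
sample="$(mktemp)"
printf '%s\n' \
  'GNU General Public License' \
  'Licensed under the GNU GPL' \
  'Developed by GNU' > "$sample"

# ^ anchors the pattern to the start of the line.
starts_with="$(grep '^GNU' "$sample")"

# $ anchors the pattern to the end of the line. Single quotes keep the
# shell from treating $ as the start of a variable name.
ends_with="$(grep 'GNU$' "$sample")"

echo "$starts_with"  # GNU General Public License
echo "$ends_with"    # Developed by GNU

rm -f "$sample"
```

The middle line contains GNU but matches neither pattern, because the string is at neither end of the line.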

Searching Strings by Matching Any Character

Perhaps you’re looking for a string but don’t remember its exact spelling. If so, you can use the period metacharacter (.) in a regular expression to match any single character.

Periods are helpful when one or more characters in the string can vary. For example, the regular expression “c.t” matches the strings “cat”, “cot”, “cut”, and so on.

Run the grep command below to see how the . symbol works in practice. This command searches through the GPL-3 file for lines that contain any two characters followed by the string cept. Each of the two adjacent dots in the regular expression matches any single character at that position.

sudo grep  --color=always "..cept" GPL-3

Notice below that the grep command produces tons of results, as the string cept appears multiple times in the file in strings like unacceptable, exceptions, acceptance, and so on.

Matching Any Character Within a Configuration File
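The behavior of the dot can be checked on a few short words. This is a sketch with made-up sample input; the -o flag, which the tutorial does not otherwise use, prints only the matched text instead of the whole line.

```shell
# One word per line; only three of them have exactly one character
# between "c" and "t".
sample="$(mktemp)"
printf '%s\n' 'cat' 'cot' 'cut' 'ct' 'coat' > "$sample"

# "c.t" needs exactly one character between c and t, so "ct" (none)
# and "coat" (two) do not match.
dot_matches="$(grep "c.t" "$sample")"

# Two dots match any two characters before "cept"; -o shows just the match.
accept_except="$(echo "unacceptable exceptions" | grep -o "..cept")"

echo "$dot_matches"    # cat / cot / cut
echo "$accept_except"  # accept / except

rm -f "$sample"
```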

Searching Strings by Matching Sets of Characters

As with matching any character, you can use the bracket ([]) metacharacter in a regular expression to match any one character from a specific set. For example, the regular expression [abc] matches a single “a”, “b”, or “c”.

Run the grep command below to search through the GPL-3 file for any lines that contain the string “too” or “two.”

sudo grep --color=always "t[wo]o" GPL-3
Finding Lines of Strings that Contain the Strings "too" or "two"

Additionally, you can use the range operator to specify a range of elements instead of listing out every individual element. So the regular expression [a-c] would match any lowercase letter from “a” to “c,” while the regular expression [A-Z] would match any uppercase letter from “A” to “Z.”

Execute the below grep command to search for every line of strings that begin with a capital (^[A-Z]) letter in the GPL-3 file.

sudo grep --color=always "^[A-Z]" GPL-3
Searching for Lines of Strings that Begin with Capital Letter
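Both the set and the range can be tried on a handful of sample lines, as in the sketch below. The sample text and variable names are illustrative assumptions. Setting LC_ALL=C for the range search is also an assumption added here: in some locales a range such as [A-Z] can match more than the 26 uppercase ASCII letters, and the C locale avoids that.

```shell
# Sample lines chosen so that each pattern matches only some of them.
sample="$(mktemp)"
printf '%s\n' \
  'two apples' \
  'too many' \
  'ten pears' \
  'Too late' \
  'tao' > "$sample"

# [wo] matches exactly one character that is either "w" or "o".
set_matches="$(grep "t[wo]o" "$sample")"

# [A-Z] is a range; together with ^ it selects lines that start with
# an uppercase letter.
capital_start="$(LC_ALL=C grep "^[A-Z]" "$sample")"

echo "$set_matches"    # two apples / too many
echo "$capital_start"  # Too late

rm -f "$sample"
```

Note that “Too late” is not in the first result, since the pattern starts with a lowercase t and the search is case-sensitive.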

Searching Strings by Predefined Classes

Apart from metacharacters, the grep command also supports a number of predefined classes of characters that you can use in your regular expressions. Using these predefined classes whenever possible is recommended, as they are more likely to be compatible with different grep implementations.

The most common classes are shown in the table below:

Previewing Predefined Classes of Characters

Run the grep command below to search for every line that begins with a capital letter in the GPL-3 file. This command is similar to the earlier one that matched a set of characters, but it uses the predefined character class [:upper:] instead.

sudo grep --color=always "^[[:upper:]]" GPL-3
Searching for Line of Strings that Begin with a Capital Letter using Predefined Character Class [:upper:]
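A few of the POSIX character classes can be compared side by side. The snippet below is a sketch on made-up sample lines; [:digit:] and [:space:] are standard POSIX classes, although the tutorial itself only demonstrates [:upper:].

```shell
# Each sample line starts with a different kind of character.
sample="$(mktemp)"
printf '%s\n' \
  'Version 3' \
  'version three' \
  '29 June 2007' \
  '  indented' > "$sample"

# The class name goes inside a bracket expression, hence the double brackets.
upper_start="$(grep "^[[:upper:]]" "$sample")"   # starts with an uppercase letter
digit_start="$(grep "^[[:digit:]]" "$sample")"   # starts with a digit
space_start="$(grep "^[[:space:]]" "$sample")"   # starts with whitespace

echo "$upper_start"  # Version 3
echo "$digit_start"  # 29 June 2007
echo "$space_start"  # (two spaces)indented

rm -f "$sample"
```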

Escaping Metacharacters

You’ve just searched strings with the help of metacharacters, which you’ll often encounter when using grep commands. Each metacharacter has a special meaning in regular expressions. So you must escape metacharacters with backslashes (\) to tell grep not to use their special meaning.

The most common metacharacters are periods (.) and brackets ([]).

Suppose you plan to look for an IP address (with lots of dots in it) and nothing else. Without escaping the dots (metacharacters), grep also prints lines containing strings that are not IP addresses.

Run the below grep command to find the IP address 192.168.1.18, and see how grep interprets the dots’ special meaning.

sudo grep "192.168.1.18" GPL-3

As shown below, grep printed two lines from the GPL-3 file: one containing the IP address 192.168.1.18 and one containing a string that is not an IP address (192a168b1c18).

Finding an IP Address without Escaping Metacharacters

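Now, execute the below command to tell grep to interpret the dots as literal characters by escaping them (\.). The backslashes are doubled in the command because the pattern is inside double quotes, where the shell reduces \\ to a single \ before passing the pattern to grep.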

sudo grep --color=always "192\\.168\\.1\\.18" GPL-3

You can see that the grep command only prints the string with the 192.168.1.18 IP address in it, as shown below.

Escaping Meta Characters
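The effect of escaping can be measured by counting matching lines with -c. This is a minimal sketch on two made-up lines; the single-quoted pattern and the -F (fixed strings) flag are alternatives added here for illustration and are not part of the original walkthrough.

```shell
# One line holds a real IP address, the other a look-alike string.
sample="$(mktemp)"
printf '%s\n' \
  'server 192.168.1.18 is up' \
  'id 192a168b1c18 is not an address' > "$sample"

# Unescaped dots match any character, so both lines count as matches.
unescaped="$(grep -c "192.168.1.18" "$sample")"

# Escaped dots match only a literal period. Inside single quotes the
# shell passes the backslashes to grep unchanged.
escaped="$(grep -c '192\.168\.1\.18' "$sample")"

# -F treats the whole pattern as a fixed string, so nothing needs escaping.
fixed="$(grep -cF '192.168.1.18' "$sample")"

echo "$unescaped"  # 2
echo "$escaped"    # 1
echo "$fixed"      # 1

rm -f "$sample"
```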

Conclusion

In this article, you have learned how to use grep with regex to find strings in files using different patterns. You’ve also learned how to escape metacharacters used in regular expressions, which helps you avoid unexpected matches when writing a script. And you can now confidently use grep to find text patterns in files with great precision.

With this power at your fingertips, why not dig deeper to find the information you need and analyze your data?
