Remove lines matching string in grep

I have some long configuration files to go through, and I would like to see just what is actually active in the .conf file, without any of the # comment lines. What can I use to output only the lines without a # on them? I'm running this on Debian Wheezy.

Canadian Luke

Posted 2013-08-20T16:00:21.723

Reputation: 22 162

Answers

If the file is named foo.conf:

grep -E '^[^#].*' foo.conf should do it.

Explanation:

-E: Support extended regular expressions!

'^[^#].*': A regular expression surrounded by single quotes.

^[^#].*: The regular expression itself.

^ (at position 0 of the regular expression): Says "Match starting at the beginning of a line / the first character immediately following a newline, or the first character of the file itself."

[^#]: Says "Match exactly one character that is not the character #."

.*: Says "Match zero or more of any characters except a newline, for the rest of the line."

The net effect is that, if you have a file with contents like the following:

#foo
bar
#baz
fly

This regular expression will fail to match the first and third lines, because the first character of lines 1 and 3 is in fact a #. The part of the regular expression that requires exactly one non-# character ([^#]) fails to match, so grep excludes those lines.

The regular expression will then successfully match the remaining lines, because the first character of lines 2 and 4 is indeed not a #.
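To try this out, here is a small sketch that builds the sample file from above and runs the same grep on it. The temporary file and the extra blank line at the end are my own additions for illustration; the blank line is there to show that it gets dropped too, since [^#] needs at least one character to match.

```shell
# Hypothetical sample file: the four lines from the example plus one blank line.
conf=$(mktemp)
printf '%s\n' '#foo' 'bar' '#baz' 'fly' '' > "$conf"

# Keep only lines whose first character exists and is not '#'.
active=$(grep -E '^[^#].*' "$conf")
printf '%s\n' "$active"   # prints "bar" and "fly", one per line

rm -f "$conf"
```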

Building on our success so far, you can also exclude lines such as:

    #I am tricky!

(Notice that there is whitespace (tabs or spaces) in front of the comment, and since it's still a comment, we don't want it!)

by using the following:

grep -P '^\s*+[^#].*' foo.conf

(but the .* is not strictly required; I know, I know.)

So now we have:

-P: Support Perl-compatible regular expressions! (Hint: this may not be available in all versions/implementations of grep, but it has been available in GNU grep for a while now.)

\s*+: The little bit of added regex that says, "Match zero or more whitespace characters (such as spaces or tabs), and if you do see them, you MUST eat them." This latter part is very important, because without the possessive quantifier *+, the regex engine could backtrack and let one of the leading spaces match as the [^#], which would trick the regular expression into accepting an indented comment. Possessive quantifiers do not exist in the POSIX-compatible (Basic or Extended) regular expression flavors, so we have to use PCRE here. There may be a way to do this without a possessive quantifier, but I am not aware of it.
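For what it's worth, one possible way to get a similar result with plain POSIX character classes is sketched below. This is not from the answer above, just an alternative to consider: either invert the match with -v, or exclude whitespace from the bracket expression so a leading space can never stand in for the first "real" character. The sample file and the variable names are made up for illustration.

```shell
# Hypothetical sample file with an indented comment and a blank line.
conf=$(mktemp)
printf '%s\n' '#foo' 'bar' '    #I am tricky!' '' 'fly' > "$conf"

# Option A: invert the match. Drop lines whose first non-blank character is '#'.
# Blank lines are kept.
no_comments=$(grep -v '^[[:space:]]*#' "$conf")

# Option B: require a first non-blank character that is not '#'.
# Whitespace is excluded from the bracket expression, so backtracking cannot
# let a leading space match there. Blank lines are dropped as well.
active=$(grep '^[[:space:]]*[^#[:space:]]' "$conf")

printf '%s\n' "$active"   # prints "bar" and "fly", one per line

rm -f "$conf"
```

Option A is probably the easier one to read; Option B behaves more like the -P version in that it also hides empty and whitespace-only lines.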

allquixotic

Posted 2013-08-20T16:00:21.723

Reputation: 32 256

1) you don't need extended grep for this, the expression is "non-extended".

2) this will only remove lines where the comment is the first character, which is not part of the requirements

3) the .* is unnecessary in this case, though is useful sometimes with grep -o

– Rich Homolka – 2013-08-20T16:09:35.670

1) Lesson learned; you are right. 2) Fixed with an update using PCRE and a possessive quantifier. He's looking for lines that are considered by the conf file parser to be comments, not just lines that happen to contain a # somewhere in them. Comments in most files are those lines which start either with a # in the first column, or with a # following some whitespace. 3) Lesson learned, thanks.

– allquixotic – 2013-08-20T16:26:27.470

grep -v '#' file

You need to quote the # because your shell may interpret that as a comment as well.
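A quick sketch of what that command keeps, using a made-up three-line file (the file contents and variable names are only for illustration):

```shell
# Hypothetical sample: a full-line comment, a plain line, and a trailing comment.
conf=$(mktemp)
printf '%s\n' '#foo' 'bar' 'fly #trailing note' > "$conf"

# -v inverts the match: print only lines that contain no '#' at all.
kept=$(grep -v '#' "$conf")
printf '%s\n' "$kept"   # prints only "bar"

rm -f "$conf"
```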

This grep removes every line that contains a # anywhere on it, which may not be what you want. What may be more useful is removing just the comments and keeping the rest of the line. You can do that with sed, the 'Stream EDitor'. This may give you what you want.

sed -e 's/#.*//' < file | less

The regular expression says "find everything from a comment character to the end of the line, and remove it"; the result is then piped to less so it is easier to read.
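The sed command on its own leaves an empty line behind wherever a whole line was a comment. A possible extension, sketched here under my own assumptions (the two extra -e expressions and the sample file are not part of the command above), also trims trailing whitespace and deletes the lines that end up empty:

```shell
# Hypothetical sample file mixing full-line, indented, and trailing comments.
conf=$(mktemp)
printf '%s\n' '#foo' 'bar' 'fly #trailing note' '    #I am tricky!' > "$conf"

# 1st expression: remove everything from the first '#' to the end of the line.
# 2nd expression: trim whitespace left at the end of the line.
# 3rd expression: delete lines that are now empty.
stripped=$(sed -e 's/#.*//' -e 's/[[:space:]]*$//' -e '/^$/d' "$conf")
printf '%s\n' "$stripped"   # prints "bar" and "fly", one per line

rm -f "$conf"
```

Keep in mind that this treats every # as the start of a comment, so a # inside a quoted value or a URL would be cut off as well.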

If the comments are always in the first column, you can use the method as written by @somequixotic, which only shows lines where there is no comment in the first column.

Rich Homolka

Posted 2013-08-20T16:00:21.723

Reputation: 27 121

This answer is incorrect, because a line such as olol#olol (or, more practically with a script, a line such as ./baz.sh #this does ...) will fail to match with this strategy, whereas if you use an extended regular expression as in my answer, it will match successfully. (This comment is based on the text of this answer as of its original revision.) – allquixotic – 2013-08-20T16:11:25.533

@somequixotic yeah, i thought of that after i posted. I was already editing it. Thanks though. – Rich Homolka – 2013-08-20T16:12:44.497

Removed my downvote – allquixotic – 2013-08-20T16:19:01.753