Updated Sat, 21 May 2022 14:44:19 GMT

Use grep --exclude/--include syntax to not grep through certain files


I'm looking for the string foo= in text files in a directory tree. This is on an ordinary Linux machine with the bash shell:

grep -ircl "foo=" *

The directories also contain many binary files that match "foo=". As these results are not relevant and slow down the search, I want grep to skip these files (mostly JPEG and PNG images). How would I do that?

I know there are the --exclude=PATTERN and --include=PATTERN options, but what is the pattern format? The man page of grep says:

--include=PATTERN     Recurse in directories only searching file matching PATTERN.
--exclude=PATTERN     Recurse in directories skip file matching PATTERN.

Searching for grep include, grep include exclude, grep exclude and variants did not turn up anything relevant.

If there's a better way of grepping only in certain files, I'm all for it; moving the offending files is not an option. I can't search only certain directories (the directory structure is a big mess, with everything everywhere). Also, I can't install anything, so I have to make do with common tools (like grep or the suggested find).




Solution

Use the shell globbing syntax:

grep pattern -r --include=\*.cpp --include=\*.h rootdir

The syntax for --exclude is identical.
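As a rough sketch of the --exclude form applied to the image files from the question: the directory layout and file names below are made up for the demo, and the options are assumed to behave as in GNU grep.

```shell
# Build a small throwaway tree (all names here are hypothetical).
demo=$(mktemp -d)
mkdir -p "$demo/src/img"
printf 'foo=1\n' > "$demo/src/config.txt"
printf 'foo=2\n' > "$demo/src/img/photo.jpg"
printf 'foo=3\n' > "$demo/src/img/logo.png"

# Quote each glob so that grep, not the shell, interprets it.
# -r recurse, -i ignore case, -l print only the names of matching files.
matches=$(grep -ril --exclude='*.jpg' --exclude='*.png' 'foo=' "$demo")

# Only the path ending in src/config.txt is listed.
echo "$matches"

rm -rf "$demo"
```

The two image files contain the string as well, but they are skipped because their names match an --exclude glob.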

Note that the star is escaped with a backslash to prevent the shell from expanding it (quoting it, as in --include="*.cpp", works just as well). If the shell did expand the glob, grep would receive literal file names instead of a pattern, for example grep pattern -r --include=foo.cpp --include=bar.cpp rootdir. It would then search only files named foo.cpp and bar.cpp, which is most likely not what you wanted.
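To check that the escaped and quoted spellings are interchangeable, a small sketch like this prints the argument each one produces (printf stands in for grep here, purely for illustration):

```shell
# Each spelling hands the same literal argument to the command.
escaped=$(printf '%s' --include=\*.cpp)
double_quoted=$(printf '%s' --include="*.cpp")
single_quoted=$(printf '%s' --include='*.cpp')

# Prints --include=*.cpp three times.
printf '%s\n' "$escaped" "$double_quoted" "$single_quoted"
```

Whichever form you pick, the point is that the star reaches grep untouched.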

Update 2021-03-04

I've edited the original answer to remove the use of brace expansion. Several shells, such as Bash and zsh, provide it to shorten patterns like this, but it is not part of the POSIX shell specification.

The original example was:

grep pattern -r --include=\*.{cpp,h} rootdir

to search through all .cpp and .h files rooted in the directory rootdir.
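To see what the brace form turns into before grep runs, one can ask bash to echo the arguments. This sketch assumes bash is installed and on the PATH; it is invoked explicitly because a plain POSIX sh may leave the braces untouched.

```shell
# Let bash perform the brace expansion and show the resulting arguments.
expanded=$(bash -c 'echo --include=\*.{cpp,h}')

# Prints: --include=*.cpp --include=*.h
echo "$expanded"
```

So under bash the brace version is just shorthand for the two separate --include options shown at the top of the answer.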





Comments (5)

  • +0 – I don't know why, but I had to quote the include pattern like this: grep pattern -r --include="*.{cpp,h}" rootdir — Dec 09, 2011 at 07:41  
  • +6 – @topek: Good point -- if you have any .cpp/.h files in your current directory, then the shell will expand the glob before invoking grep, so you'll end up with a command line like grep pattern -r --include=foo.cpp --include=bar.h rootdir, which will only search files named foo.cpp or bar.h. If you don't have any files that match the glob in the current directory, then the shell passes on the glob to grep, which interprets it correctly. — Dec 14, 2011 at 22:51  
  • +7 – I just realized that the glob is matched only against the filename. To exclude a whole directory, one needs the --exclude-dir option. The same rules apply, though: only the directory name is matched, not a path. — Sep 22, 2015 at 17:00  
  • +3 – --include doesn't seem to work after --exclude. I suppose it doesn't make sense to even try, except that I have an alias to grep with a long list of --exclude and --exclude-dir, which I use for searching code, ignoring libraries and swap files and things. I would've hoped that grep -r --exclude='*.foo' --include='*.bar' would work, so I could limit my alias to --include='*.bar' only, but it seems to ignore the --include and include everything that's not a .foo file. Swapping the order of the --include and --exclude works, but alas, that's not helpful with my alias. — Aug 10, 2016 at 13:49  
  • +1 – How can we read someone's mind to get the rules for this PATTERN? For half an hour I haven't been able to find any description of what is expected there. — Aug 09, 2018 at 08:22
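
The --exclude-dir option mentioned in the comments can be tried out with a sketch along these lines. The directory names app and vendor are invented for the demo, and the behaviour shown assumes GNU grep.

```shell
# Throwaway tree with one directory we want to search and one to skip.
demo=$(mktemp -d)
mkdir -p "$demo/app" "$demo/vendor"
printf 'foo=1\n' > "$demo/app/main.cpp"
printf 'foo=2\n' > "$demo/vendor/lib.cpp"

# --exclude-dir is given a directory name (not a path);
# --include restricts the search to files whose names match the glob.
found=$(grep -rl --include='*.cpp' --exclude-dir=vendor 'foo=' "$demo")

# Only the path ending in app/main.cpp is listed.
echo "$found"

rm -rf "$demo"
```

Here vendor/lib.cpp matches the --include glob but is never visited, because grep does not descend into the excluded directory.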