Debian GNU/Linux 11 (bullseye), grep (GNU grep) 3.6

I need to find a string in all files (doc, docx and pdf) in the current directory, but the grep command is not working for me:

grep -ril "word" .

It doesn't output anything. What's wrong?

asked Jun 3 at 12:51

1 Answer

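All three formats need to be converted to text before they can be searched using tools such as grep: .docx files are ZIP archives of XML, PDF text is usually stored in compressed streams, and .doc is a binary format, so the word rarely appears in them as plain bytes.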

For "old-style" .doc files, use catdoc:

catdoc file.doc | grep word

For OOXML .docx files, use docx2txt:

docx2txt < file.docx | grep word

or

docx2txt file.docx - | grep word

For PDF files, use pdfgrep:

pdfgrep word file.pdf

or pdftotext:

pdftotext file.pdf - | grep word
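
To cover every file under the current directory in one go, as in the question, the per-format commands above can be combined with find. The following is only a sketch: the function names to_text and grep_docs are made up for this example, it assumes catdoc, docx2txt and pdftotext are installed, and it does not handle file names containing newlines.

```shell
#!/bin/sh
# Hypothetical helpers (names invented for this example).
# Assumes catdoc, docx2txt and pdftotext are installed.

# Convert one file to plain text on stdout, based on its extension.
to_text() {
    case "$1" in
        *.pdf)  pdftotext "$1" - ;;
        *.docx) docx2txt "$1" - ;;
        *.doc)  catdoc "$1" ;;
        *)      cat "$1" ;;
    esac
}

# Print the name of every .doc, .docx or .pdf file under directory $2
# (default: current directory) whose text contains $1, ignoring case.
# This mimics the output of "grep -ril".
grep_docs() {
    pattern=$1
    dir=${2:-.}
    find "$dir" -type f \( -name '*.doc' -o -name '*.docx' -o -name '*.pdf' \) |
    while IFS= read -r file; do
        # </dev/null keeps the converters from reading find's output.
        if to_text "$file" </dev/null 2>/dev/null | grep -qi -- "$pattern"; then
            printf '%s\n' "$file"
        fi
    done
}
```

After sourcing this, "grep_docs word ." should list the matching files, one per line.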

If you switch to ripgrep, you can use a preprocessor that converts each file before it is searched:

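#!/bin/sh -
# ripgrep runs this with the file path as $1 and the file contents on stdin.
# Empty or missing file: nothing to convert, pass stdin through unchanged.
if [ ! -s "$1" ]; then exec cat; fi
case "$1" in
*.pdf)
  exec pdftotext - -
  ;;
*.doc)
  exec catdoc -
  ;;
*.docx)
  exec docx2txt - -
  ;;
*)
  # Any other file type is searched as-is.
  exec cat
  ;;
esac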

Save this to a file, make it executable (chmod 755), and use it with --pre:

rg --pre /path/to/preprocessor word

See the ripgrep guide for tips on reducing the overhead of the preprocessor.
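
One of those tips is the --pre-glob option, which makes ripgrep run the preprocessor only for files matching the given globs and read everything else directly. A minimal sketch follows, wrapped in a shell function for convenience; the function name rgdocs and the RG_PRE variable are invented for this example, and /path/to/preprocessor is the same placeholder as above.

```shell
# Hypothetical wrapper (name and RG_PRE variable invented for this example).
# Runs the preprocessor only on .pdf, .doc and .docx files; other files are
# searched directly, which avoids spawning a process for each of them.
rgdocs() {
    rg --pre "${RG_PRE:-/path/to/preprocessor}" \
       --pre-glob '*.pdf' --pre-glob '*.doc' --pre-glob '*.docx' \
       "$@"
}
```

With this, "rgdocs -il word" would be the rough equivalent of the grep -ril command in the question.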

answered Jun 3 at 14:24