System Info

alinuxchap@libertus-desktop:/usr/share/X11/xkb $ uname -a
Linux libertus-desktop 6.12.25+rpt-rpi-v8 #1 SMP PREEMPT Debian 1:6.12.25-1+rpt1 (2025-04-30) aarch64 GNU/Linux
alinuxchap@libertus-desktop:/usr/share/X11/xkb $ 

Cmd

cd /home/alinuxchap/Documents/shared/dat/EDS/it
: > output.txt   # start with an empty log file
while IFS= read -r author; do
 echo "$author"
 pdfgrep "$author" *.pdf |& tee -a output.txt
done <authors.txt

Problem

  • pdfgrep prints text matches in bold and red in the terminal, but that highlighting is lost when the output is saved to a file
  • I don't want to use grep -p, as I also need to see the 'snippet' of context in which the term is used

Keeping the formatting is useful for archiving command output as 'logs'; the same problem arises with copy and paste, which doesn't preserve rich-text formatting either.

asked Jun 17 at 17:35
  • I think it's a clear question, but what's output.txt for? Commented Jun 17 at 17:45
  • Hi there! Edited the code to make it clearer now, but don't hesitate to ask more questions @wobtax Commented Jun 17 at 18:29
  • I can't see the edit yet, but from what it looks like, output.txt is just going to be an empty file. Is the intent to put the output of the while read ... command into output.txt? Commented Jun 18 at 21:51
  • Oops, must have forgotten to hit save! So yes, this completes the scenario: on the one hand I wanted to have formatted text, but on the other I wanted to save it, so this way I was sort of compromising by doing a bit of both, though not at the same time XD. Commented Jun 20 at 20:04
  • Ah, got it! If you happen to want your loop to read from an input file and overwrite an output file, you can do: while read author; do ... done <input.txt >output.txt, or even while read author; do ... done >output.txt <input.txt. Commented Jun 20 at 20:11

1 Answer

For saving colors to a file, try adding --color always:

pdfgrep --color always "$author" *.pdf > output.txt

Then you can cat the file and it'll still be bold and red where you need it.
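
Applied to the loop from the question, that could look roughly like the sketch below. I haven't run it against real PDFs, and search_authors and the PDFGREP variable are names I made up for illustration, not anything pdfgrep provides. The point it shows is that tee turns pdfgrep's output into a pipe, so the color has to be forced on.

```shell
# Hypothetical helper (not part of pdfgrep): search every PDF in the current
# directory for each author listed in a file, showing the results on screen
# and saving them, color codes included, to a log file.
# The PDFGREP variable is an assumption that lets you swap in another command.
search_authors() {
  authors_file=$1
  log_file=$2
  while IFS= read -r author; do
    printf '%s\n' "$author"
    # tee makes stdout a pipe rather than a terminal, so force the colors on
    "${PDFGREP:-pdfgrep}" --color always "$author" *.pdf 2>&1
  done < "$authors_file" | tee "$log_file"
}
```

You'd call it as search_authors authors.txt output.txt from the directory holding the PDFs. Because the whole loop is piped into a single tee, the log is overwritten once per run instead of once per author.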

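If for some reason you have a version of pdfgrep that doesn’t have this option (or you came here because you’re using a different program), you can instead put script -q /dev/null in front of the command to make a "fake" terminal. This is the BSD/macOS form; the util-linux script found on most Linux systems takes the command through its -c option instead: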

script -q /dev/null pdfgrep "$author" *.pdf > output.txt

For converting to RTF, ansifilter is working pretty well for me:

pdfgrep --color always "$author" *.pdf | ansifilter --rtf > output.rtf

This works because pdfgrep writes non-printing ANSI color codes around each match. Red text starts with \e[0;31m, and \e[0m resets colors and formatting, so printing red text looks like:

echo $'here is \e[0;31mred text\e[0m' # the $ makes it interpret `\e` sequences

But pdfgrep checks whether its output is going to a terminal that supports colors, so by default it only inserts these codes when they will have an effect. You can override that with --color always.
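
To make the "auto" behavior concrete, here is a small sketch of the usual pattern. It is not pdfgrep's actual code; print_match is a hypothetical function, and it assumes the check is the standard "is stdout a terminal?" test that the shell exposes as [ -t 1 ].

```shell
# Hypothetical helper imitating the common auto/always/never color modes.
# $1 is the color mode, $2 is the text to print.
print_match() {
  mode=$1
  text=$2
  # Color when forced, or in auto mode when stdout (fd 1) is a terminal
  if [ "$mode" = always ] || { [ "$mode" = auto ] && [ -t 1 ]; }; then
    printf '\033[0;31m%s\033[0m\n' "$text"
  else
    printf '%s\n' "$text"
  fi
}
```

Running print_match auto hit directly in a terminal should print red text, while print_match auto hit > file.txt or print_match auto hit | cat writes plain text, since stdout is no longer a terminal. With always, the codes are written either way.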

answered Jun 17 at 17:57
