In all the many years using this site I’ve never had to ask a question because there has ALWAYS been an answer (usually numerous). I’m pretty sure this one does too, but for the life of me I cannot find it.
I have a directory with a bunch of files which have numerous lines of random length.
a.txt
b.txt
c.txt
d.txt
Then I have a single file, eg.txt, with a set list of strings
opq 111
rst 222
uvw 333
xyz 444
Each of the txt files has a single string I’d like to replace
a.txt has a#P#b
b.txt has c#P#d
c.txt has e#P#f
d.txt has g#P#h
I want to replace #P# with the second ‘column’ from my file of strings. The #P# occurs only one time per file (because I’ve put it there).
The result would be
a.txt has a111b
b.txt has c222d
c.txt has e333f
d.txt has g444h
The ‘constant’ assumption is that there are as many lines in eg.txt as there are .txt files in my directory, and that they are in alphabetical order. The lines in eg.txt are sorted alphabetically by ‘column’ 1.
I’ve been trying to do it using awk and sed (well, actually sd) within a for loop, but I’m failing to get it to read both ‘source’ and ‘target’ line by line.
I’m not fussy as to how I achieve the result. Currently I’m not working with many lines or files (15 lines and 15 files right now) but there will be times where there will be quite a lot more. I am using zsh as my shell on both an Arch & Debian based linux distro (WSL 2 at times)
Apologies if this has an answer. I’ve really tried to find it over the last two days while working on this project and my brain is now spent.
EDIT: Updated to clarify that the files in the directory have numerous lines of various length and that my given string #P# occurs only once per file.
5 Answers
Using GNU awk for "inplace" editing and ARGIND:
awk -i inplace '
    NR == FNR { map[NR]=$2 }
    NR != FNR { sub(/#P#/,map[ARGIND-1]) }   # eg.txt is ARGV[1], so the first ?.txt file has ARGIND 2
1' eg.txt ?.txt
The above assumes the replacement text from eg.txt doesn't contain spaces or &s.
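If the replacement text may contain & or backslashes, one possible workaround is to skip sub() and splice the string in with index() and substr(), which treat it literally. The following is only a minimal sketch of that idea, not part of the answer above: it uses plain awk and prints to stdout instead of editing in place, and the scratch directory, the sample file contents, and the nfile counter are all made up for illustration.

```shell
# Scratch directory with hypothetical sample data.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 'opq R&D' 'rst C:\tmp' > eg.txt
printf '%s\n' 'a#P#b' 'second line' > a.txt
printf '%s\n' 'first line' 'c#P#d' > b.txt

out=$(awk '
    NR == FNR { map[NR] = $2; next }   # first file: remember column 2 of each line
    FNR == 1  { nfile++ }              # a new data file starts: move to the next replacement
    {
        pos = index($0, "#P#")         # literal search, no regex involved
        if (pos)
            $0 = substr($0, 1, pos - 1) map[nfile] substr($0, pos + 3)
        print FILENAME ": " $0
    }
' eg.txt a.txt b.txt)

printf '%s\n' "$out"
```

Because nothing here is interpreted as a regex or a sub() replacement, the & and \ in the sample strings come through unchanged.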
- While I used the initial answer to solve my problem, this one has been super handy because of its sheer versatility. I had no idea how versatile awk was. Really helped give me a better understanding of awk and the amazing uses for NR, FNR, ORS, OFS etc. along with sub, map & ARGIND. Thank you! – 0m3rta3, Aug 10, 2020 at 11:02
- You're welcome. The main benefit to this approach is it'll run orders of magnitude faster than calling sed in a loop reading one line at a time - see why-is-using-a-shell-loop-to-process-text-considered-bad-practice. – Ed Morton, Aug 10, 2020 at 12:37
preparations
Only one line in each file.
$ grep -- . ?.txt
a.txt:a#P#b
b.txt:c#P#d
c.txt:e#P#f
d.txt:g#P#h
$ cat input
opq 111
rst 222
uvw 333
xyz 444
solution
Have a shell loop call sed for each file:
for file in ?.txt; do
read -r dummy new_string rest
sed -- "s/#P#/$new_string/g" "$file"
done <input
a111b
c222d
e333f
g444h
Change that to sed -i with GNU sed or compatible, or sed -i '' with FreeBSD sed or compatible, once you are satisfied with the result and want the files changed.
The above assumes the lines of input don't contain &, /, or \ characters. If they may, you would have to escape those with backslashes first.
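One way that escaping step might look is sketched below. It is an illustration under assumptions rather than a tested part of the answer: the escaped variable, the scratch directory, and the sample URL and path are hypothetical, and the inner sed simply puts a backslash in front of every &, / and \ before the string is used as a replacement.

```shell
# Scratch directory with hypothetical sample data.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 'opq https://example.com/a?x=1&y=2' 'rst C:\dir' > input
printf '%s\n' 'a#P#b' 'other line' > a.txt
printf '%s\n' 'c#P#d' > b.txt

result=$(
    for file in ?.txt; do
        read -r dummy new_string rest
        # Prefix every &, / and \ in the replacement with a backslash.
        escaped=$(printf '%s\n' "$new_string" | sed 's|[\\/&]|\\&|g')
        sed "s/#P#/$escaped/" "$file"
    done <input
)

printf '%s\n' "$result"
```

The read -r matters here: without -r the shell would strip backslashes from the line before sed ever sees them.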
- Sorry, I don’t think I was clear that only the single file has one ‘result’ per line. The txt files within the directory have numerous lines, but the string #P# only occurs one time in each file. Your example has, however, solved a separate issue I had with a similar problem on separate files. So thank you. – 0m3rta3, Aug 3, 2020 at 22:04
- @0m3rta3 That's what I assumed. Should not be a problem for my solution. – Hauke Laging, Aug 3, 2020 at 22:10
- But the grep preparation wouldn’t give that kind of output. Unless I’m missing something? Which is possible. I’m using this in a couple of ways. One example is where the directory of files are puppeteer scripts. The 2nd column of the single file are various URLs (all different). I then place an uncommon string (#P#) everywhere I want a URL inserted and then loop the list to insert each URL into each script. So each script has different calls and different URLs. This is just an example of one of the things I’m trying to do. – 0m3rta3, Aug 3, 2020 at 22:16
- URLs might not be the best example because I don’t have to worry about escaping special characters in the majority of my cases. It’s just a simple replace(this)-with(that). – 0m3rta3, Aug 3, 2020 at 22:28
- @0m3rta3 The grep is just supposed to show readers what my files look like so that my code can easily be tested. The sed should change just the one respective line in a multi-line file. Have you tried that at all? Of course, the sed call needs a separator char which does not appear in your URLs or you will get into quoting hell. – Hauke Laging, Aug 3, 2020 at 22:35
#!/bin/sh
mv eg.txt eg.input
awk 'NR==FNR{a[++i]=$2;next}FNR==1{++j}{sub("#P#",a[j]);print>(FILENAME".new")}' eg.input ./*.txt &&
for f in *.txt; do mv "$f.new" "$f"; done
mv eg.input eg.txt
eg.txt is renamed to eg.input and then back so that *.txt in the awk line expands only to the files that should be modified.
NR==FNR{ #For the first file, eg.input
a[++i]=$2 #Put the second field in the array `a`
next #Skip the rest of the code
}
FNR==1{++j} #At the first line of each other file, advance to the next replacement
{ #For every line of the other files
sub("#P#",a[j]) #Make the substitution
print>(FILENAME".new") #Print the line to `FILENAME`.new
}
Then, in a for loop, the contents of the old *.txt files are overwritten by the contents of the *.new files. You may want to suppress the for loop until you are convinced that the *.new files are correct.
Some awk implementations do not handle many open files (GNU awk does). If your awk exits with a "too many open files" error, use this variant:
awk 'NR==FNR{a[++i]=$2;next}FNR==1{close(fn);fn=FILENAME".new";++j}{sub("#P#",a[j]);print>fn}'
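Before letting the for loop overwrite anything, the *.new files could be reviewed with something like the sketch below. It is only an illustration: the scratch directory and the sample files stand in for what the awk step would have produced, and the report variable is a made-up name.

```shell
# Scratch directory with hypothetical originals and the .new files awk would write.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 'a#P#b' '12345' > a.txt
printf '%s\n' 'a111b' '12345' > a.txt.new
printf '%s\n' 'plain' > b.txt
printf '%s\n' 'plain' > b.txt.new

report=$(
    for f in *.txt; do
        if cmp -s "$f" "$f.new"; then
            echo "$f: unchanged"
        else
            echo "$f: changed"
            diff "$f" "$f.new" || true   # diff exits non-zero when the files differ
        fi
    done
)

printf '%s\n' "$report"
```

A file reported as unchanged usually means its #P# was not found or its replacement was empty, which is worth checking before running the mv loop.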
- For some reason it seems to work so far up to inserting the new string. The result is just that the #P# is removed. Ran it with -x set and I think I see why that is and will check it out shortly, but I just wanted to say thank you, as both these answers have taught me more in under an hour than I’ve learned trying to figure this out the last 3 days. Both your answers introduced me to commands I wasn’t familiar with, and looking at the man pages with them as context is SUPER handy. I upvoted but it won't show till I have 15 rep. Thank you again! – 0m3rta3, Aug 3, 2020 at 23:15
- @0m3rta3 I don't think -x will help you much as that is a Bash flag. From your description it seems that for some reason the array did not get populated, although I tested here on sample files and it worked. Well, if you learned from the answers, I'm already happy. Always glad to help those who are willing to learn. – Quasímodo, Aug 3, 2020 at 23:26
- Oh. Right. Thanks for pointing that out, because I didn’t copy-paste your script (was told from the start not to get into that habit and I’ve tried to stick to it). But that comes with its own issues when I don’t pay attention. As such I used the bash shebang, not regular sh. Hence the -x working. That might be where the issue actually is though. Lol. Dummy – 0m3rta3, Aug 3, 2020 at 23:30
- @0m3rta3 Always copy/paste code to/from this site, don't try to re-type it, because then you end up wasting your time and other people's time trying to help you with problems that simply don't exist in the code. There's nothing in the posted script that requires sh or bash nor requires you not to use either of them. – Ed Morton, Aug 4, 2020 at 13:40
Since you are already on zsh, and I presume you have GNU's version of sed, we can do it in a two-step process as shown.
setopt extended_glob
sed -Ei -e '/#P#/R eg.txt' ./(^eg).txt
sed -Ei -e '/#P#/N;s/#P#(.*)\n.*\s(.*)/\2\1/' ./(^eg).txt
Brief explanation
Turn on extended globbing so that we can filter out the specific file eg.txt from the sed command line.
The first sed command places the respective line from eg.txt after the line containing #P#, with the help of the R command. Read up on this GNU-specific command in the manual for more info.
The second sed command merges the two lines and does a cut-and-paste job to get the desired output.
The files are edited in place (except eg.txt).
eg.txt
opq 111
rst 222
uvw 333
xyz 444
a.txt
a#P#b
12345
apple
b.txt
c#P#d
56788
command
j=1; for i in a.txt b.txt; do b=$(sed -n "${j}p" eg.txt | awk '{print $2}'); sed "s/#P#/$b/g" "$i"; echo "================="; j=$((j+1)); done
output
below are the output of a.txt
a111b
12345
apple
=================
below are the output of b.txt
c222d
56788
=================
- Please let me know the reason for the downvote. – Praveen Kumar BS, Aug 8, 2020 at 10:26
- Not sure who downvoted, but I’d imagine it’s because this doesn’t do what I was needing. At least the outputs are not at all what I was looking for as per my question. – 0m3rta3, Aug 10, 2020 at 10:55