I have some text files I use to take notes in - just plain text, usually just using cat >> file. Occasionally I use a blank line or two (just return - the new-line character) to mark a new subject/line of thought. At the end of each session, before closing the file with Ctrl+D, I typically add lots (5-10) of blank lines (return key) just to separate the sessions.
This is obviously not very clever, but it works for me for this purpose. I do however end up with lots and lots of unnecessary blank lines, so I'm looking for a way to remove (most of) the extra lines. Is there a Linux command (cut, paste, grep, ...?) that could be used directly with a few options? Alternatively, does anybody have an idea for a sed, awk or perl script (well, in any scripting language really, though I'd prefer sed or awk) that would do what I want? Writing something in C++ (which I actually could do myself) just seems like overkill.
Case #1: What I need is a script/command that would remove more than two (3 or more) consecutive blank lines, and replace them with just two blank lines. Though it would be nice if it also could be tweaked to remove more than one line (2 or more) and/or replace multiple blank lines with just one blank line.
Case #2: I could also use a script/command that would remove a single blank line between two lines of text, but leave multiple blank lines as is (though removing one of the blank lines would also be acceptable).
10 Answers
Case 1:
awk '!NF {if (++n <= 2) print; next}; {n=0;print}'
Case 2:
awk '!NF {s = s $0 "\n"; n++; next}
     {if (n>1) printf "%s", s; n=0; s=""; print}
     END {if (n>1) printf "%s", s}'
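For readers who find the awk one-liners terse, here is a rough Python sketch of the same two ideas. The function names and the keep parameter are made up for illustration and are not part of the answer above; like awk's !NF test, a line containing only whitespace is treated as blank.

```python
def squeeze_blank_runs(lines, keep=2):
    """Yield lines, letting through at most `keep` blank lines per run (Case 1)."""
    run = 0  # length of the current run of blank lines
    for line in lines:
        if line.strip():        # non-blank: reset the counter, like n=0 in awk
            run = 0
            yield line
        else:
            run += 1
            if run <= keep:     # same test as `++n <= 2` in the awk version
                yield line


def drop_single_blanks(lines):
    """Yield lines, dropping isolated blank lines and keeping longer runs (Case 2)."""
    pending = []  # blank lines buffered until the length of the run is known
    for line in lines:
        if line.strip():
            if len(pending) > 1:
                yield from pending
            pending = []
            yield line
        else:
            pending.append(line)
    if len(pending) > 1:        # mirrors the END block of the awk version
        yield from pending
```

Calling squeeze_blank_runs(lines, keep=1) gives the "replace multiple blank lines with just one" variant asked about in the question.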
- Since this use case is repeated frequently, I would suggest creating a script. – ChuckCottrill, Oct 10, 2013 at 23:43
You can use uniq to collapse multiple instances of blank lines into one blank line, but it will also collapse adjacent lines that contain identical text.
Case 1:
perl -i -ane '$n=(@F==0) ? $n+1 : 0; print if $n<=2'
Case 2:
perl -i -ane '$n=(@F==0) ? $n+1 : 0; print $n==2 ? "\n$_" : $n==1 ? "" : $_ '
You can address Case #1 like this with GNU sed:
sed -r ':a; /^\s*$/ {N;ba}; s/( *\n *){2,}/\n\n/'
That is, collect empty lines in pattern space, and if there are three or more of them, reduce them to two.
To join single-spaced lines, as in Case #2, you can do it like this:
sed -r '/^ *\S/!b; N; /\n *$/!b; N; /\S *$/!b; s/\n *\n/\n/'
Or in commented form:
sed -r '
/^ *\S/!b # non-empty line
N #
/\n *$/!b # followed by empty line
N #
/\S *$/!b # non-empty line
s/\n *\n/\n/ # remove the empty line
'
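If the file is small enough to read into memory, the same two transformations can also be written as whole-text substitutions. This is only a sketch in Python, not a translation of the sed scripts above; the function names and the keep parameter are hypothetical, and runs of blank lines at the very start of the text are not handled specially.

```python
import re


def collapse_blank_runs(text, keep=2):
    """Reduce every run of more than `keep` blank lines to exactly `keep` (Case 1)."""
    # A newline ending a text line, followed by keep+1 or more blank lines.
    pattern = r'\n(?:[ \t]*\n){' + str(keep + 1) + r',}'
    return re.sub(pattern, '\n' * (keep + 1), text)


def join_single_spaced(text):
    """Remove a blank line that sits alone between two non-blank lines (Case 2)."""
    # Group 1 is the preceding non-blank line; the lookahead requires the
    # following line to be non-blank without consuming it, so that line can
    # serve as the "preceding" line of the next match.
    return re.sub(r'(\S[ \t]*\n)[ \t]*\n(?=[ \t]*\S)', r'\1', text)
```

The lookahead is a deliberate choice here: it lets text that is single-spaced throughout (text, blank, text, blank, text) be joined in one pass.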
Just posting because I'm surprised nobody has mentioned it:
cat -s
The -s option turns multiple consecutive blank lines into one blank line.
$ echo -e "hello\n\n\n\n\n\n\n\n\n\nworld" > hello.txt
$ cat -s hello.txt
hello

world
It's not directly customizable, but it could make it a little easier to organize your text. For instance, you could run the file through cat -s first and then change every blank line into something else:
$ cat -s hello.txt | sed "s/^[[:space:]]*$/...\n . \n.../g"
hello
...
.
...
world
Following Anthon's suggestion to use "uniq"...
Remove leading, trailing and duplicate blank lines.
# Get large random string.
rand_str=; while [[ ${#rand_str} -lt 40 ]]; do rand_str=$rand_str$RANDOM; done
# Add extra lines at beginning and end of stdin.
(echo $rand_str; cat; echo $rand_str) |
# Convert empty lines to random strings.
sed "s/^$/$rand_str/" |
# Remove duplicate lines.
uniq |
# Remove first and last line.
sed '1d;$d' |
# Convert random strings to empty lines.
sed "s/$rand_str//"
In one long line:
(rand_str=; while [[ ${#rand_str} -lt 40 ]]; do rand_str=$rand_str$RANDOM; done; (echo $rand_str; cat; echo $rand_str) | sed "s/^$/$rand_str/" | uniq | sed '1d;$d' | sed "s/$rand_str//")
Or just use "cat -s".
I switched from parentheses to curly braces in order to remain in the current shell context, which I assume is more efficient. Note that curly braces require a semicolon after the last command and a space for separation.
# Add extra blank lines at beginning and end.
# These will be removed in final step.
{ echo; cat; echo; } |
# Replace multiple blank lines with a single blank line.
cat -s |
# Remove first and last line.
sed '1d;$d'
In a single line.
{ { echo; cat; echo; } | cat -s | sed '1d;$d'; }
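The net effect of both pipelines above (collapse each run of blank lines to one, and drop blank lines at the very start and end) can be sketched in Python without sentinel lines. The function name is made up for illustration. One assumption differs from the sed "s/^$/.../" step: whitespace-only lines are treated as blank here.

```python
from itertools import groupby


def tidy_blank_lines(lines):
    """Collapse each run of blank lines to one and drop runs at both ends."""
    out = []
    # groupby splits the input into alternating runs of blank / non-blank lines.
    for is_blank, run in groupby(lines, key=lambda line: not line.strip()):
        if is_blank:
            out.append("")      # one empty line stands in for the whole run
        else:
            out.extend(run)
    # After collapsing, at most one blank line can remain at each end.
    if out and out[0] == "":
        out.pop(0)
    if out and out[-1] == "":
        out.pop()
    return out
```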
This solution takes care also of the last blank lines in the file:
sed -r -n '
/^ *$/!{p;b} # non-blank line - print and next cycle
h # blank line - save it in hold space
:loop
$b end # last line - go to end
n # read next line in pattern space
/^ *$/b loop # blank line - loop to next one
:end # pattern space has non-blank line or last blank line
/^ *$/{p;b} # last blank line: print and exit
H;x;p # non-blank line: print hold + pattern space and next cycle
'
Here is a simple way to replace multiple blank lines with a single blank line, using the cat, uniq, and tee commands.
$ cat myfile | uniq | tee myfile
- Slick and it works (assuming no dup lines, like in code and config). Leaves some dups but only if the lines have whitespace chars. – Michael Bushe, Jan 16, 2022 at 19:50
- This would fail as soon as the file grows to exceed the pipe buffer's size. (Commented Aug 3, 2024 at 8:16)
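One common way around the problem raised in the last comment is to write the result to a temporary file in the same directory and then rename it over the original, so the input is never truncated while it is still being read. A minimal Python sketch, assuming UTF-8 text; the function name is hypothetical.

```python
import os
import tempfile
from itertools import groupby


def squeeze_file_in_place(path):
    """Collapse adjacent identical lines (like uniq) and replace the file safely."""
    directory = os.path.dirname(os.path.abspath(path))
    with open(path, encoding="utf-8") as src, tempfile.NamedTemporaryFile(
        "w", encoding="utf-8", dir=directory, delete=False
    ) as tmp:
        # groupby with no key groups identical adjacent lines, as uniq does.
        for line, _run in groupby(src):
            tmp.write(line)
    # Rename only after both files are closed; the temp file is on the same
    # filesystem as the target because it was created in the same directory.
    os.replace(tmp.name, path)
```

Note that the replacement file gets the temporary file's permissions rather than the original's, so copy the mode across if that matters. Like uniq, this also collapses repeated non-blank lines.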
Using Raku (formerly known as Perl_6)
CASE 1 -- Collapse 2+ blank lines to a single blank line:
~$ raku -ne 'BEGIN my $n; if (.chars == 0) {.put if ++$n <= 1; next}; $n = 0; .put;' file
#OR:
~$ raku -ne 'BEGIN my $n; (.chars == 0) ?? $n++ !! ($n = 0); .put if $n <= 1;' file
#OR:
~$ raku -e 'lines.join("\n").subst(:global, / \n**3..* /, "\n\n").put;' file
Above are three answers written in Raku that take a text file and collapse two-or-more blank lines into a single blank line. The first example closely follows an awk answer already posted. The second example closely follows a perl answer already posted. The third example uses Raku's lines routine.
CASE 2 -- Remove single blank lines, leave multiple blank lines unchanged:
~$ raku -ne 'BEGIN my $n=0; $n=(.chars == 0) ?? $n+1 !! 0;
$n == 2 ?? "\n$_".put !! $n == 1 ?? "".print !! $_.put ;' file
#OR:
~$ raku -e 'lines.join("\n").subst(:global, / [ ^ | \N \n ] <( \n )> \N /, "").put;' file
Above are two answers written in Raku that take a text file and remove single blank lines, while leaving multiple blank lines unchanged. The first example closely follows a perl answer already posted. The second example uses Raku's lines routine, with a substitution regex that basically recognizes \N \n\n \N, that is, two consecutive newlines surrounded by non-newline characters (the actual pattern is more complex, to eliminate a leading newline at the start of the file). Raku's <( ... )> capture markers are used to replace only a single \n newline within a larger recognition sequence.
NOTE: If you need to remove multiple newlines at the start/end of the file, insert a call to trim, trim-leading and/or trim-trailing.
https://docs.raku.org/language/regexes#Capture_markers:_%3C(_)%3E
https://docs.raku.org/language/regexes
https://raku.org
The posted solutions looked a little bit cryptic to me. Here is the solution in Python 3.6:
#!/usr/bin/env python3
from pathlib import Path
import sys
import fileinput


def remove_multiple_blank_lines_from_file(path, strip_right=True):
    # Track whether each of the two most recent lines was non-blank.
    non_blank_lines_out_of_two_last_lines = [True, True]
    # With inplace=True, anything written to stdout goes back into the file.
    for line in fileinput.input(str(path), inplace=True):
        non_blank_lines_out_of_two_last_lines.pop(0)
        non_blank_lines_out_of_two_last_lines.append(bool(line.strip()))
        # Skip the line only if it and the previous line are both blank.
        if sum(non_blank_lines_out_of_two_last_lines) > 0:
            line_to_write = line.rstrip() + '\n' if strip_right else line
            sys.stdout.write(line_to_write)


def remove_multiple_blank_lines_by_glob(rglob='*', path=Path('.'), strip_right=True):
    for p in path.rglob(rglob):
        if p.is_file():
            try:
                remove_multiple_blank_lines_from_file(p, strip_right=strip_right)
            except Exception as e:
                print(f"File '{p}' was not processed due to the error: {e}")


if __name__ == '__main__':
    remove_multiple_blank_lines_by_glob(sys.argv[1], Path(sys.argv[2]), next(iter(sys.argv[3:]), None) == '--strip-right')
You can call the functions from an interpreter or run it from the shell like:
$ ./remove_multiple_lines.py '*' /tmp/ --strip-right
- This is as cryptic to those who find the syntax of python horrid as the others are cryptic to those who don't know those commands. Also some of those are commented and this one is not. This is also much longer. Not saying it's bad - just saying that it's not any better wrt being cryptic. – Pryftan, Nov 2, 2023 at 12:57