16

I have some text-files I use to take notes in - just plain text, usually just using cat >> file. Occasionally I use a blank line or two (just return - the new-line character) to specify a new subject/line of thought. At the end of each session, before closing the file with Ctrl+D, I typically add lots (5-10) blank lines (return-key) just to separate the sessions.

This is obviously not very clever, but it works for me for this purpose. I do, however, end up with lots and lots of unnecessary blank lines, so I'm looking for a way to remove (most of) the extra lines. Is there a Linux command (cut, paste, grep, ...?) that could be used directly with a few options? Alternatively, does anybody have an idea for a sed, awk or perl script (well, any scripting language really, though I'd prefer sed or awk) that would do what I want? Writing something in C++ (which I actually could do myself) just seems like overkill.

Case #1: What I need is a script/command that would remove more than two (3 or more) consecutive blank lines, and replace them with just two blank lines. Though it would be nice if it also could be tweaked to remove more than one line (2 or more) and/or replace multiple blank lines with just one blank line.

Case #2: I could also use a script/command that would remove a single blank line between two lines of text, but leave multiple blank lines as is (though removing one of the blank lines would also be acceptable).

jubilatious1
asked Apr 17, 2013 at 10:39
2

10 Answers

17

Case 1:

awk '!NF {if (++n <= 2) print; next}; {n=0;print}'

Case 2:

awk '!NF {s = s $0 "\n"; n++; next}
 {if (n>1) printf "%s", s; n=0; s=""; print}
 END {if (n>1) printf "%s", s}'
answered Apr 17, 2013 at 11:03
1
  • Since this use case is repeated frequently, I would suggest creating a script. Commented Oct 10, 2013 at 23:43
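Following the comment's suggestion, here is a minimal sketch of how the Case 1 one-liner could be wrapped in a reusable shell function. The name squeeze_blank and its optional argument are assumptions made for illustration; they do not come from the answer. The argument covers the "tweakable" variants from the question (pass 1 to keep a single blank line).

```shell
#!/bin/sh
# squeeze_blank: hypothetical helper name, not part of the original answer.
# Reads stdin and keeps at most MAX consecutive blank lines (default 2).
# A line counts as blank when it has no fields (!NF), i.e. it is empty
# or holds only spaces and tabs, the same test the awk answer uses.
squeeze_blank() {
    awk -v max="${1:-2}" '
        !NF { if (++n <= max) print; next }   # blank: print only the first max of a run
            { n = 0; print }                  # text: reset the counter and print
    '
}

# Example: five blank lines between two notes shrink to two.
printf 'note one\n\n\n\n\n\nnote two\n' | squeeze_blank 2
```

Saved in a file on the PATH (or sourced from a shell startup file), it could then be run as squeeze_blank 1 < notes.txt, where notes.txt stands in for any notes file.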
16

You can use uniq to collapse multiple consecutive blank lines into one blank line, but it will also collapse lines that contain text if they are identical and directly below each other.
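A small sketch of that caveat, next to an awk filter that only looks at blank lines. The helper names sample and collapse_blank are made up for this illustration.

```shell
# sample: hypothetical test input with two identical text lines
# followed by three blank lines.
sample() { printf 'same\nsame\n\n\n\nend\n'; }

# uniq merges every run of identical adjacent lines, so the repeated
# "same" line is dropped together with the extra blank lines.
sample | uniq

# collapse_blank: hypothetical helper that collapses only blank lines.
collapse_blank() {
    awk '
        NF || !prev_blank { print }   # keep text lines and the first blank of a run
        { prev_blank = !NF }          # remember whether this line was blank
    '
}

# Both "same" lines survive; the three blank lines become one.
sample | collapse_blank
```

The awk variant treats whitespace-only lines as blank as well, whereas uniq only merges lines that are byte-for-byte identical.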

answered Apr 17, 2013 at 10:56
0
7

Case 1:

perl -i -ane '$n=(@F==0) ? $n+1 : 0; print if $n<=2'

Case 2:

perl -i -ane '$n=(@F==0) ? $n+1 : 0; print $n==2 ? "\n$_" : $n==1 ? "" : $_ '
answered May 10, 2013 at 3:50
0
5

You can address Case #1 like this with GNU sed:

sed -r ':a; /^\s*$/ {N;ba}; s/( *\n *){2,}/\n\n/'

That is, collect empty lines in pattern space, and if there are three or more of them, reduce them to two.

To join single-spaced lines, as in Case #2, you can do it like this:

sed -r '/^ *\S/!b; N; /\n *$/!b; N; /\S *$/!b; s/\n *\n/\n/'

Or in commented form:

sed -r '
 /^ *\S/!b # non-empty line
 N # 
 /\n *$/!b # followed by empty line
 N # 
 /\S *$/!b # non-empty line
 s/\n *\n/\n/ # remove the empty line
'
answered Apr 17, 2013 at 12:46
2

Just posting because I'm surprised nobody has mentioned it:

cat -s

The -s option turns multiple consecutive blank lines into one blank line.

$ echo -e "hello\n\n\n\n\n\n\n\n\n\nworld" > hello.txt
$ cat -s hello.txt
hello

world

It's not directly customizable, but it could make it a little easier to organize your text. For instance, you could run the file through cat -s first and then change every blank line into something else:

$ cat -s hello.txt | sed "s/^[[:space:]]*$/...\n . \n.../g"
hello
...
 . 
...
world
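One thing to keep in mind: as far as I know, cat -s only squeezes lines that are completely empty, so lines holding just spaces or tabs are left alone. A minimal sketch of a workaround, assuming it is acceptable to strip trailing whitespace from every line first (squeeze_ws is a made-up name):

```shell
# squeeze_ws: hypothetical helper name. Strips trailing whitespace so
# that whitespace-only lines become truly empty, then lets cat -s
# squeeze the resulting runs of empty lines into one.
squeeze_ws() {
    sed 's/[[:space:]]*$//' | cat -s
}

# Example: a line of spaces, an empty line and a line with a tab
# between two words are reduced to a single empty line.
printf 'hello\n  \n\n\t\nworld\n' | squeeze_ws
```

Note the side effect: trailing whitespace is removed from text lines too.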
answered Aug 4, 2024 at 2:58
1

Following Anthon's suggestion to use "uniq"...

Remove leading, trailing and duplicate blank lines.

# Get large random string.
rand_str=; while [[ ${#rand_str} -lt 40 ]]; do rand_str=$rand_str$RANDOM; done
# Add extra lines at beginning and end of stdin.
(echo $rand_str; cat; echo $rand_str) |
# Convert empty lines to random strings.
sed "s/^$/$rand_str/" |
# Remove duplicate lines.
uniq |
# Remove first and last line.
sed '1d;$d' |
# Convert random strings to empty lines.
sed "s/$rand_str//"

In one long line:

(rand_str=; while [[ ${#rand_str} -lt 40 ]]; do rand_str=$rand_str$RANDOM; done; (echo $rand_str; cat; echo $rand_str) | sed "s/^$/$rand_str/" | uniq | sed '1d;$d' | sed "s/$rand_str//")

Or just use "cat -s".

I switched from parentheses to curly braces in order to remain in the current shell context, which I assume is more efficient. Note that curly braces require a semicolon after the last command and need a space for separation.

# Add extra blank lines at beginning and end.
# These will be removed in final step.
{ echo; cat; echo; } |
# Replace multiple blank lines with a single blank line.
cat -s |
# Remove first and last line.
sed '1d;$d'

In a single line.

{ { echo; cat; echo; } | cat -s | sed '1d;$d'; }
answered Mar 27, 2015 at 20:59
1

This solution also takes care of the trailing blank lines at the end of the file:

sed -r -n '
 /^ *$/!{p;b} # non-blank line - print and next cycle
 h # blank line - save it in hold space
 :loop
 $b end # last line - go to end
 n # read next line in pattern space
 /^ *$/b loop # blank line - loop to next one
 :end # pattern space has non-blank line or last blank line
 /^ *$/{p;b} # last blank line: print and exit
 H;x;p # non-blank line: print hold + pattern space and next cycle
'
answered Dec 16, 2016 at 15:58
1

Here is a simple way to replace multiple blank lines with a single blank line, using the cat, uniq, and tee commands.

$ cat myfile | uniq | tee myfile
answered Apr 17, 2020 at 18:30
2
  • Slick and it works (assuming no dup lines, like in code and config). Leaves some dups but only if the lines have whitespace chars. Commented Jan 16, 2022 at 19:50
  • This would fail as soon as the file grows to exceed the pipe buffer's size. Commented Aug 3, 2024 at 8:16
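To avoid the problems raised in these comments, a common approach is to write the result to a temporary file and only then overwrite the original. The sketch below is one way to do that; squeeze_file is a hypothetical name, and it swaps uniq for cat -s so that identical text lines are not merged.

```shell
# squeeze_file: hypothetical helper name. Rewrites the file given as $1
# with runs of empty lines collapsed to one. The output goes to a
# temporary file first, so the input is never truncated while it is
# still being read.
squeeze_file() {
    tmp=$(mktemp) || return 1
    # Copy back only if cat -s succeeded; writing into the existing
    # file keeps its permissions and ownership.
    cat -s -- "$1" > "$tmp" && cat -- "$tmp" > "$1"
    status=$?
    rm -f -- "$tmp"
    return "$status"
}
```

It would be called as squeeze_file myfile. If cat -s fails (for example because the file does not exist), the original is left untouched.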
0

Using Raku (formerly known as Perl_6)

CASE 1 -- Collapse 2+ blank lines to a single blank line:

~$ raku -ne 'BEGIN my $n; if (.chars == 0) {.put if ++$n <= 1; next}; $n = 0; .put;' file
#OR:
~$ raku -ne 'BEGIN my $n; (.chars == 0) ?? $n++ !! ($n = 0); .put if $n <= 1;' file
#OR:
~$ raku -e 'lines.join("\n").subst(:global, / \n**3..* /, "\n\n").put;' file

Above are three answers written in Raku that take a text file and collapse two-or-more blank lines into a single blank line. The first example closely follows an awk answer already posted. The second example closely follows a perl answer already posted. The third example uses Raku's lines routine.


CASE 2 -- Remove single blank lines, leave multiple blank lines unchanged:

~$ raku -ne 'BEGIN my $n=0; $n=(.chars == 0) ?? $n+1 !! 0; 
 $n == 2 ?? "\n$_".put !! $n == 1 ?? "".print !! $_.put ;' file 
#OR:
~$ raku -e 'lines.join("\n").subst(:global, / [ ^ | \N \n ] <( \n )> \N /, "").put;' file

Above are two answers written in Raku that take a text file and remove single blank lines while leaving multiple blank lines unchanged. The first example closely follows a perl answer already posted. The second example uses Raku's lines routine with a substitution regex that basically recognizes \N \n\n \N, that is, two consecutive newlines surrounded by non-newline characters (the actual pattern is more complex, in order to eliminate a leading newline at the start of the file). Raku's <(...)> capture markers are used to replace only a single \n newline within a larger recognition sequence.

NOTE: If you need to remove multiple newlines at the start/end of the file, insert a call to either trim, trim-leading and/or trim-trailing.

https://docs.raku.org/language/regexes#Capture_markers:_%3C(_)%3E
https://docs.raku.org/language/regexes
https://raku.org

answered Aug 3, 2024 at 8:05
-1

The posted solutions looked a little bit cryptic to me. Here is the solution in Python 3.6:

#!/usr/bin/env python3
from pathlib import Path
import sys
import fileinput

def remove_multiple_blank_lines_from_file(path, strip_right=True):
    non_blank_lines_out_of_two_last_lines = [True, True]
    for line in fileinput.input(str(path), inplace=True):
        non_blank_lines_out_of_two_last_lines.pop(0)
        non_blank_lines_out_of_two_last_lines.append(bool(line.strip()))
        if sum(non_blank_lines_out_of_two_last_lines) > 0:
            line_to_write = line.rstrip() + '\n' if strip_right else line
            sys.stdout.write(line_to_write)

def remove_multiple_blank_lines_by_glob(rglob='*', path=Path('.'), strip_right=True):
    for p in path.rglob(rglob):
        if p.is_file():
            try:
                remove_multiple_blank_lines_from_file(p, strip_right=strip_right)
            except Exception as e:
                print(f"File '{p}' was not processed due the error: {e}")

if __name__ == '__main__':
    remove_multiple_blank_lines_by_glob(sys.argv[1], Path(sys.argv[2]), next(iter(sys.argv[3:]), None) == '--strip-right')

You can call the functions from an interpreter or run it from the shell like:

$ ./remove_multiple_lines.py '*' /tmp/ --strip-right
answered Jul 8, 2019 at 10:10
1
  • 2
    This is as cryptic to those who find the syntax of python horrid as the others are cryptic to those who don't know those commands. Also some of those are commented and this one is not. This is also much longer. Not saying it's bad - just saying that it's not any better wrt being cryptic. Commented Nov 2, 2023 at 12:57
