(The story continues in A simple method for compressing white space in text (Java) - Take II.)
Intro
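I have this text space compressor: it removes leading and trailing whitespace and collapses every inner run of whitespace into a single space. For example,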
" hello world " -> "hello world"
" \n world \t hello " -> "world hello"
Code
package io.github.coderodde.text;

import java.util.Objects;

/**
 * This class provides a linear time method for compressing space.
 *
 * @author Rodion "rodde" Efremov
 * @version 1.0.0 (Oct 30, 2025)
 * @since 1.0.0 (Oct 30, 2025)
 */
public final class TextSpaceCompressor {

    public static String spaceCompress(String text) {
        Objects.requireNonNull(text);
        StringBuilder sb = new StringBuilder();
        int textLength = text.length();
        int loIndex = 0;
        int hiIndex = textLength - 1;

        // Scan empty prefix if any:
        for (; loIndex < hiIndex; ++loIndex) {
            if (!Character.isWhitespace(text.charAt(loIndex))) {
                break;
            }
        }

        // Scan empty suffix if any:
        for (; hiIndex > loIndex; --hiIndex) {
            if (!Character.isWhitespace(text.charAt(hiIndex))) {
                break;
            }
        }

        // When a single character remains, the text is blank only if that
        // character is whitespace. Checking the indices alone would turn
        // inputs such as "a" or " a" into "".
        if (loIndex == hiIndex
                && Character.isWhitespace(text.charAt(loIndex))) {
            // The input text is blank:
            return "";
        }

        boolean scanningSpaceSequence = false;

        while (loIndex <= hiIndex) {
            char ch = text.charAt(loIndex);

            if (!Character.isWhitespace(ch)) {
                sb.append(ch);
                scanningSpaceSequence = false;
            } else if (!scanningSpaceSequence) {
                scanningSpaceSequence = true;
                sb.append(' ');
            }

            loIndex++;
        }

        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(spaceCompress("hello world"));
        System.out.println(spaceCompress(" hello world"));
        System.out.println(spaceCompress("hello world "));
        System.out.println(spaceCompress(" hello \t world "));
        System.out.println(spaceCompress(" cat \t \t dog \n mouse "));
    }
}
Output
hello world
hello world
hello world
hello world
cat dog mouse
Critique request
As always, tell me anything that comes to mind.
1 Answer
Simplicity
It strikes me that the simplest method to do this is to leverage the standard library. You will want to learn how to use regular expressions. Your effort will be rewarded.
public final class TextSpaceCompressor {
    public static String spaceCompress(String text) {
        return text.strip().replaceAll("\\s+", " ");
    }
}
Now, this will also remove newlines. Your test examples indicate you're okay doing this, but if you wanted to preserve newlines you could use streams to map over the lines, perform the substitutions and then collect the string back together, joining with newlines.
import java.util.stream.Collectors;

public final class TextSpaceCompressor {
    public static String spaceCompress(String text) {
        return text
            .lines()
            .map(line -> line.strip().replaceAll("\\s+", " "))
            .collect(Collectors.joining("\n"));
    }
}
You might further wish to remove empty lines. For instance, "hello world \n foo \n \n bar" becoming "hello world\nfoo\nbar".
import java.util.stream.Collectors;

public final class TextSpaceCompressor {
    public static String spaceCompress(String text) {
        return text
            .lines()
            .map(line -> line.strip().replaceAll("\\s+", " "))
            // Compare by content; != on strings compares references.
            .filter(line -> !line.isEmpty())
            .collect(Collectors.joining("\n"));
    }
}
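One more thing to be aware of with the one-liner above: String.replaceAll compiles its regular expression on every call. If you expect to call the method often, you could compile the pattern once and reuse it. Below is a minimal sketch of that idea, not a definitive implementation. The class name PrecompiledSpaceCompressor and the constant WHITESPACE_RUN are names I made up for illustration, and String.strip() assumes Java 11 or later.

```java
import java.util.Objects;
import java.util.regex.Pattern;

// Hypothetical class name, used only for this illustration.
final class PrecompiledSpaceCompressor {

    // Compiled once and reused across calls.
    private static final Pattern WHITESPACE_RUN = Pattern.compile("\\s+");

    private PrecompiledSpaceCompressor() {
    }

    static String spaceCompress(String text) {
        Objects.requireNonNull(text, "text is null");
        // strip() drops leading and trailing whitespace (Java 11+);
        // the matcher then replaces each inner run with a single space.
        return WHITESPACE_RUN.matcher(text.strip()).replaceAll(" ");
    }

    public static void main(String[] args) {
        // Prints: cat dog mouse
        System.out.println(spaceCompress("  cat \t \t dog \n mouse  "));
    }
}
```

Whether this makes a measurable difference depends on how often the method runs, so it is worth measuring before you commit to it.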
Comments on your code
I note this loop:
while (loIndex <= hiIndex) {
    char ch = text.charAt(loIndex);
    if (!Character.isWhitespace(ch)) {
        sb.append(ch);
        scanningSpaceSequence = false;
    } else if (!scanningSpaceSequence) {
        scanningSpaceSequence = true;
        sb.append(' ');
    }
    loIndex++;
}
- You have some inconsistent whitespace.
- In either branch of the conditional you set the value of scanningSpaceSequence, but you do it in one case after appending to your StringBuilder, and in one case before. The order doesn't matter, so it feels odd that this is not consistent.
- Since the increment of loIndex is not conditional, this might be better suited to a for loop.
for (; loIndex <= hiIndex; loIndex++) {
    char ch = text.charAt(loIndex);
    if (!Character.isWhitespace(ch)) {
        sb.append(ch);
        scanningSpaceSequence = false;
    } else if (!scanningSpaceSequence) {
        sb.append(' ');
        scanningSpaceSequence = true;
    }
}
In your main method, it would be a good idea to create an array of strings, and then iterate over them. This will greatly facilitate adding test cases.
public static void main(String[] args) {
    String[] tests = {
        "hello world",
        " hello world",
        "hello world ",
        " hello \t world ",
        " cat \t \t dog \n mouse "
    };

    for (String test : tests) {
        System.out.println(spaceCompress(test));
    }
}
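Taking the array idea one step further, you could pair each input with the output you expect, so the program tells you about a mismatch instead of leaving you to read the printed lines. The sketch below is one way to do that under a few assumptions: the class name SpaceCompressChecks and the helper countFailures are hypothetical, and the spaceCompress stand-in is the regex one-liner rather than your loop. It also adds empty, blank and single-character inputs, which are easy to overlook.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical class name, used only for this illustration.
final class SpaceCompressChecks {

    // Stand-in for the method under test (the regex one-liner, Java 11+).
    static String spaceCompress(String text) {
        return text.strip().replaceAll("\\s+", " ");
    }

    // Returns how many inputs produced something other than the expected output.
    static int countFailures(Map<String, String> cases) {
        int failures = 0;
        for (Map.Entry<String, String> entry : cases.entrySet()) {
            String actual = spaceCompress(entry.getKey());
            if (!actual.equals(entry.getValue())) {
                failures++;
                System.out.printf("FAIL: \"%s\" -> \"%s\", expected \"%s\"%n",
                                  entry.getKey(), actual, entry.getValue());
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the cases in insertion order.
        Map<String, String> cases = new LinkedHashMap<>();
        cases.put("", "");
        cases.put("   ", "");
        cases.put("a", "a");
        cases.put(" a", "a");
        cases.put("  hello \t world  ", "hello world");
        cases.put(" cat \t \t dog \n mouse ", "cat dog mouse");

        // Prints: failures: 0
        System.out.println("failures: " + countFailures(cases));
    }
}
```

Once the table grows, moving it into a unit test framework would be the natural next step.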
Comment by STerliakov (Oct 30, 2025 at 18:38):
The regex way is probably still faster in this case (because of the many .charAt calls in the OP), but sometimes rewriting regex-based processing with simple string manipulation routines can pay off. Yes, it's longer and error-prone, but may help squeeze a few more cycles out of a hot loop.
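For reference, a regex-free version does not have to be long. The following is a minimal single-pass sketch, not the OP's method: instead of scanning the prefix and suffix separately, it remembers a pending space and emits it only when another non-whitespace character follows. The class name SinglePassSpaceCompressor is hypothetical, and no claim is made here about how its speed compares to the regex version; that would need a benchmark.

```java
// Hypothetical class name, used only for this illustration.
final class SinglePassSpaceCompressor {

    static String spaceCompress(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        boolean pendingSpace = false;

        for (int i = 0; i < text.length(); i++) {
            char ch = text.charAt(i);
            if (Character.isWhitespace(ch)) {
                // Remember the gap, but only once some text has been emitted,
                // so leading whitespace never produces a space.
                pendingSpace = sb.length() > 0;
            } else {
                if (pendingSpace) {
                    // Emit the space lazily, so trailing whitespace is dropped.
                    sb.append(' ');
                    pendingSpace = false;
                }
                sb.append(ch);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Prints: cat dog mouse
        System.out.println(spaceCompress("  cat \t \t dog \n mouse  "));
    }
}
```

Because the space is appended only right before the next non-whitespace character, the empty, blank and single-character cases need no special handling.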