I have a simple program that "minimizes" a file path. Minimizing simply means replacing unnecessary file entries with entries that mean the same thing:
path1/path2/../path3
is replaced withpath1/path3
path1/./path2
is replaced withpath1/path2
I've achieved this using regex and it seems to cover all cases, but it also feels very slow and I have a hunch it might be prone to infinite loops:
private static final Pattern ONE_DOT = Pattern.compile("/\\./");
private static final Pattern TWO_DOTS = Pattern.compile("[^/]+/\\.\\./?");
public static String minimize(final String in) {
String tmp = in;
while (!stringIsMinimized(tmp)) {
tmp = ONE_DOT.matcher(tmp).replaceAll("/");
tmp = TWO_DOTS.matcher(tmp).replaceAll("");
}
return tmp;
}
public static boolean stringIsMinimized(final String str) {
return !(ONE_DOT.matcher(str).find() || TWO_DOTS.matcher(str).find());
}
2 Answers 2
Your code actually doesn't work, because you use replaceAll
. This means that your pattern will allow you to match ../../
and replace it with , resulting in a lost double-back. You can fix this two ways:
- You could change the
replaceAll
toreplaceFirst
- You could change the pattern to exclude
../../
([^/.]+/\\.\\./?
works)
You can then simplify your loop since the ONE_DOT
case will be matched properly so you don't need to split out a check for minimization.
Thus I would suggest:
private static final Pattern ONE_DOT = Pattern.compile("/\\./");
private static final Pattern TWO_DOTS = Pattern.compile("([^/.]+/\\.\\.)+/?");
public static String minimize(final String in) {
String tmp = in;
tmp = ONE_DOT.matcher(tmp).replaceAll("/");
while (TWO_DOTS.matcher(tmp).matches())
tmp = TWO_DOTS.matcher(tmp).replaceAll("");
return tmp;
}
At least as a first-pass improvement. There may be a way to do it without the loop using some kind of counting regular expression, but off the top of my head I'm not sure.
-
\$\begingroup\$ The fastest solution would probably be an iterator, a stack and some look-around. \$\endgroup\$dfhwze– dfhwze2019年06月14日 20:17:10 +00:00Commented Jun 14, 2019 at 20:17
-
\$\begingroup\$ Yeah, fastest would probably be something like
Arrays.stream(in.split("/")).filter(x -> !x.equals(".")).reduce("", {some lambda here})
\$\endgroup\$LambdaBeta– LambdaBeta2019年06月14日 20:21:25 +00:00Commented Jun 14, 2019 at 20:21
Your code looks great!
Here, maybe another option that we might exercise would be to possibly do the entire task with an expression, maybe something similar to these:
^(.+?\/).+\/(.+)$
(.+?\/).+\/(.+)
Our first capturing group is non-greedy, collects our desired path1
for both inputs, followed by a greedy .+
that'd continue upto the last slash, and our path2
and path3
are in this group: (.+)
, and our desired output can be called using 1円2円
.
Test
import java.util.regex.Matcher;
import java.util.regex.Pattern;
final String regex = "^(.+?\\/).+\\/(.+)$";
final String string = "path1/path2/../path3\n"
+ "path1/./path2";
final String subst = "1ドル2ドル";
final Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE);
final Matcher matcher = pattern.matcher(string);
final String result = matcher.replaceAll(subst);
System.out.println(result);
Demo
console.log(`path1/path2/../path3
path1/./path2`.replace(/^(.+?\/).+\/(.+)$/gm, `1ドル2ドル`));
Performance
const repeat = 1000000;
const start = Date.now();
for (var i = repeat; i >= 0; i--) {
const regex = '/^(.+?/).+/(.+)$/gm';
const str = `path1/path2/../path3`;
const subst = `1ドル2ドル`;
var match = str.replace(regex, subst);
}
const end = Date.now() - start;
console.log("YAAAY! \"" + match + "\" is a match 💚💚💚 ");
console.log(end / 1000 + " is the runtime of " + repeat + " times benchmark test. 😳 ");
RegEx Circuit
jex.im visualizes regular expressions:
"path1/..../path2/"
yields"../path2/"
\$\endgroup\$