2
\$\begingroup\$

All line indices of lines from a text that begin with the same line prefix should be found. The prefixes and the corresponding line numbers should be returned. Streams and lambdas should be used as techniques if possible.

The format of a line is always as follows:

anyText_anyText_number_number

where anyText_anyText is the prefix, and one line has at least 7 chars.

For example:

a_b_0_1
c_d_5_8
c_d_9_3
b_d_1_12

should return {c_d=[2, 3]}.

The prefixes (here c_d) could be any text (inclusive additional underscores, but without newlines...).

This is my try:

 public static void main(final String[] args) {
 String s = """
 a_b_0_1
 c_d_5_8
 c_d_9_3
 b_d_1_12""";
 Map<String, List<Integer>> duplicates = s.lines()
 .collect(Collectors.groupingBy(l -> l.substring(0, IntStream.range(0, l.length())
 .map(i -> l.length() - i - 1)
 .filter(i -> l.charAt(i) == '_')
 .skip(1)
 .findFirst()
 .orElse(0))))
 .entrySet()
 .stream()
 .filter(e -> e.getValue()
 .size() > 1)
 .collect(Collectors.toMap(Map.Entry::getKey, e -> e.getValue()
 .stream()
 .map(e1 -> {
 LineNumberReader lnr = new LineNumberReader(new StringReader(s));
 return lnr.lines()
 .filter(l -> l.equals(e1))
 .map(l -> lnr.getLineNumber())
 .findFirst()
 .orElse(0);
 })
 .toList()));
 System.out.println(duplicates); // print all duplicates line numbers
 }

Could this code be simplified?

asked Dec 22, 2023 at 2:23
\$\endgroup\$
2
  • \$\begingroup\$ Can a prefix be as little as one character? \$\endgroup\$ Commented Dec 22, 2023 at 12:29
  • \$\begingroup\$ Hello @Reinderien, no, one prefix has at least 3 characters, for example a_b, always followed by an underscore char. \$\endgroup\$ Commented Dec 22, 2023 at 13:40

2 Answers 2

3
\$\begingroup\$

First off, the description of your code is quite unclear and leaves things out. In your comment you mention "at least 3 characters followed by an underscore", however that is not what your code does. I'm basing my review on what the code does, not the description.


I think your biggest problem is fixating on using Stream, because Java Stream lacks several features that would be helpful in this scenario, for example a zipWithIndex method and tuples in order to handle more than a single value at a time.


The part that extracts the prefix is especially convoluted. At the very least it should be extracted to its own method. If I understand it correctly, it considers everything before the last two underscores the prefix.

The most straight forward way to this would be the split the string at the underscores, remove the last two items and rejoin the items with underscores. Unfortunately Standard Java lacks methods the remove the last n elements of a Stream or a List. I don't think this can be be done sensibly with Stream. I would do something like this:

private static String prefix(String text) {
 List<String> sections = Arrays.asList(text.split("_"));
 return String.join("_", sections.subList(0, sections.size() - 2));
}

One big problem is that you only take a String with line breaks as an input, due to the convoluted way to find the line numbers with LineNumberReader in the end. An input in form of a List would be more appropriate. For this - and for reviews like this one - it would be helpful to put the central code into a well defined method.

As I suggested above, with a tuple (or alternatively a custom class, or in my case a record) this would be much easier:

public static void main(final String[] args) {
 String s = """
 a_b_0_1
 c_d_5_8
 c_d_9_3
 b_d_1_12""";
 System.out.println(findDuplicatePrefixes(s.lines().toList()));
}
private static record Line(int lineNumber, String text) {}
private static Map<String, List<Integer>> findDuplicatePrefixes(List<String> lines) {
 return IntStream.range(0, lines.size()) 
 .mapToObj(i -> new Line(i + 1, lines.get(i))) 
 .collect(groupingBy( 
 l -> prefix(l.text()), 
 mapping(Line::lineNumber, toList()) 
 )).entrySet() 
 .stream() 
 .filter(e -> e.getValue().size() > 1) 
 .collect(toMap(Entry::getKey, Entry::getValue));
}
answered Dec 22, 2023 at 14:51
\$\endgroup\$
4
  • \$\begingroup\$ What information is missing from the question? \$\endgroup\$ Commented Dec 22, 2023 at 20:28
  • \$\begingroup\$ @TobiasGrothe "Duplicate lines" should have been "duplicate prefixes". As mentioned what exactly a prefix defined as. It probably also would have made sense to explicitly mention that single entries were to be removed. \$\endgroup\$ Commented Dec 23, 2023 at 9:14
  • \$\begingroup\$ An edit suggestion would be nice to make the task clearer... \$\endgroup\$ Commented Dec 23, 2023 at 14:38
  • \$\begingroup\$ Ok, I've edited the question. \$\endgroup\$ Commented Dec 23, 2023 at 14:57
3
\$\begingroup\$

Streams and lambdas should be used as techniques if possible.

But why? Why use hammer when screw driver is so much better in bomb defusing? Look at the stream based solutions and tell me the honest words you uttered when you were told that you were given the responsibility of maintaining them.

I know the society in its later stages of capitalism does not put much weight on human suffering, but I do. I think we should consider solutions that literally hurt people as impossible. So, here is how one would implement the problem with good old loops and conditionals.

final Map<String, List<Integer>> lineNumbers = new HashMap<>();
final Pattern pattern = Pattern.compile("(.*)_\\d+_\\d+");
final String[] lines = """
 a_b_0_1
 c_d_5_8
 c_d_9_3
 b_d_1_12""".split("\n");
// Find line numbers where each prefix occurs.
for (int i = 0; i < lines.length; i++) {
 final Matcher matcher = pattern.matcher(lines[i]);
 if (matcher.matches()) {
 final String prefix = matcher.group(1);
 final List<Integer> numbers = lineNumbers
 .computeIfAbsent(prefix, p -> new ArrayList<>());
 // Line numbers are indexed from 1.
 numbers.add(i + 1);
 }
}
// Remove line numbers that only occur once.
lineNumbers.values().removeIf(numbers -> numbers.size() == 1);
System.out.println(lineNumbers);
answered Jan 9, 2024 at 7:04
\$\endgroup\$
1
  • \$\begingroup\$ Thanks, I also think, streams are here not the best choices. But I have a credo in my head that says to use streams wherever possible... Maybe just because it's the new, hip thing. ;) \$\endgroup\$ Commented Jan 9, 2024 at 11:13

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.