6
\$\begingroup\$

I would like a review of:

  • Correctness, especially around edge/corner cases
  • Efficiency
  • Readability and/or style
  • Appropriate use of Java idioms and data structures

In particular:

  • Have I missed typical idioms/patterns useful to solving the problem?
  • Could I have reused existing/popular framework/library code?

Less interesting to me:

  • Feedback on the test harness logic in main() (I could have used JUnit)

A glance at program output from online execution

I have rewritten the code below, including feedback, as Rev2 @ Find common "characters" in 2 given strings (rev2), follow the link to look at the updated version

import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Set;
import java.util.function.Function;
import java.util.stream.Collectors;
class CommonCharacters {
 @SuppressWarnings("boxing")
 private static String commonCharactersOf(String string1, String string2) {
 // Requirement
 //
 // Always return lowercase versions of common characters. e.g.:
 //
 // OK: (a, a) -> a
 // OK: (a, A) -> a
 // OK: (A, A) -> a
 // No: (A, A) -> A
 // No: (aA, aA) -> aA
 //
 // Requirement
 //
 // Return common characters joined in a String, preserving the order in
 // which they appeared in the longest argument, or in the first argument if
 // the arguments are of the same length.
 //
 // Requirement
 //
 // Handle "characters" (i.e. code points) outside the Basic Multilingual
 // Plane (BMP), including characters from Supplementary Planes.
 // There should be no `char' or `Character' based "false positives". i.e.:
 //
 // String string1 = "\uD835\uDC00", string2 = "\uD835\uDC01";
 //
 // string1 and string2 share no characters in the intended acceptation of
 // "character".
 String shortestArgument, longestArgument;
 if (string1.length() < string2.length()) {
 shortestArgument = string1;
 longestArgument = string2;
 } else {
 shortestArgument = string2;
 longestArgument = string1;
 }
 // @formatter:off
 Set<Integer> codePointsSeen =
 shortestArgument.codePoints()
 .boxed()
 .map(Character::toLowerCase)
 .collect(Collectors.toSet());
 List<Integer> codePointsInCommon = new ArrayList<>();
 for (Iterator<Integer> iterator = longestArgument.codePoints()
 .distinct()
 .iterator();
 iterator.hasNext() &&
 codePointsInCommon.size() < codePointsSeen.size();) {
 // @formatter:on
 Integer codePoint = iterator.next();
 int lowerCaseCodePoint = Character.toLowerCase(codePoint);
 if (codePointsSeen.contains(lowerCaseCodePoint)) {
 codePointsInCommon.add(lowerCaseCodePoint);
 }
 }
 StringBuilder stringBuilder = new StringBuilder();
 for (Integer codePoint : codePointsInCommon) {
 stringBuilder.appendCodePoint(codePoint);
 }
 return stringBuilder.toString();
 }
 @SuppressWarnings("boxing")
 public static void main(String[] args) {
 // @formatter:off
 String[][] testArgumentsAndExpectedValues = {
 { "" , "" , "" },
 { "a" , "" , "" },
 { "" , "a" , "" },
 { "aa" , "" , "" },
 { "" , "aa" , "" },
 { "a" , "a" , "a" },
 { "aa" , "b" , "" },
 { "b" , "aa" , "" },
 { "ab" , "ba" , "ab" },
 { "aba" , "ab" , "ab" },
 { "aba" , "ba" , "ab" },
 { "aba" , "aab" , "ab" },
 { "a" , "A" , "a" },
 { "A" , "a" , "a" },
 { "A" , "A" , "a" },
 { "ab" , "AB" , "ab" },
 { "AB" , "ab" , "ab" },
 { "aB" , "Ab" , "ab" },
 { "aB" , "Ba" , "ab" },
 { "aB" , "Ba" , "ab" },
 { "a" , "\uD835\uDC1A" , "" },
 { "\uD835\uDC1A" , "\uD835\uDC1A" , "\uD835\uDC1A" },
 { "\uD835\uDC00" , "\uD835\uDC00" , "\uD835\uDC00" },
 { "\uD835\uDC1A" , "\uD835\uDC00" , "" },
 { "\uD835\uDC00" , "\uD835\uDC01" , "" },
 { "\uD801\uDC2B" , "\uD801\uDC2B" , "\uD801\uDC2B" },
 { "\uD801\uDC03" , "\uD801\uDC03" , "\uD801\uDC2B" },
 { "\uD801\uDC2B" , "\uD801\uDC03" , "\uD801\uDC2B" },
 { "\uD83D\uDE80" , "\uD83D\uDE80" , "\uD83D\uDE80" },
 { "a" , "aaaaaaaaaaaaaaaaa" , "a" },
 // The last test should still work, and work fast, with a second
 // argument string starting with "a" and ending _many_ characters later
 // The last test values doe not test it, but illustrate the scenario
 };
 int maximumTestArgumentLength =
 Arrays.stream(testArgumentsAndExpectedValues)
 .flatMap(testValues -> Arrays.stream(testValues)
 .limit(2))
 .mapToInt(String::length)
 .max()
 .getAsInt();
 // @formatter:on
 int maximumQuotedTestArgumentLength = maximumTestArgumentLength + 2;
 for (int i = 0; i < testArgumentsAndExpectedValues.length; i++) {
 String[] currentTestValues = testArgumentsAndExpectedValues[i];
 String string1 = currentTestValues[0];
 String string2 = currentTestValues[1];
 String expectedResult = currentTestValues[2];
 String actualResult = commonCharactersOf(string1, string2);
 boolean testSuccessful = expectedResult.equals(actualResult);
 if (testSuccessful) {
 // continue; // TODO: uncomment to filter out successful tests
 }
 Function<String, String> quoteString = s -> '"' + s + '"';
 // @formatter:off
 String outputFormat = "%2d) "
 + "%5s: "
 + "(%-" + maximumQuotedTestArgumentLength + "s"
 + " , "
 + "%-" + maximumQuotedTestArgumentLength + "s)"
 + " -> "
 + "%s "
 + "(%s)"
 + "%n";
 System.out.printf(outputFormat,
 i,
 testSuccessful ? "Success" : "Failure",
 quoteString.apply(string1),
 quoteString.apply(string2),
 quoteString.apply(actualResult),
 quoteString.apply(expectedResult));
 // @formatter:on
 }
 }
}
asked Oct 3, 2015 at 18:43
\$\endgroup\$
4
  • \$\begingroup\$ What do you mean by "common characters"? Sequence(s) of characters? Duplicates are allowed? \$\endgroup\$ Commented Oct 3, 2015 at 20:10
  • \$\begingroup\$ No duplicates. (Look at my sample test cases...) \$\endgroup\$ Commented Oct 3, 2015 at 20:12
  • 1
    \$\begingroup\$ Your cases are quite limited. "aabccc" and "aaccc" gives "aaccc" or "ccc"? \$\endgroup\$ Commented Oct 3, 2015 at 21:13
  • \$\begingroup\$ "aabccc", "aaccc" -> "ac" \$\endgroup\$ Commented Oct 3, 2015 at 21:50

1 Answer 1

4
\$\begingroup\$

I like that you use Streams. But I think you should use even more of their functionality:

  • You could have used map to get the lower case for longestArgument. Or maybe it would have been simpler to just use longestArgument.toLowerCase(), and same for shortestArgument.

  • You use distinct() to remove duplicate, but you could have gotten rid of the main for-loop entirely by also using filter(...) where the filter uses a predicate built from the values in shortestArgument.

  • For the final output, you can use codePointStream.toArray() to create an int[] and then use String(codePointArray, 0, codePointArray.length). You could instead use intStream.collect(...) to perform a reduction which would fill and return a StringBuilder, but it would end up quite ugly.

You spend many lines defining the shortest/longest strings, but I don't see how that is relevant. You could just assign any string to any of those and the algorithm would still work.

I might have written the program a bit differently. I would define a function that takes a string and returns the lower case version with no duplicate, in the same order, as an IntStream. I would apply that function at the start on both input strings. And then apply filter on one of them, with the predicate built from the values of the other.

answered Oct 3, 2015 at 23:59
\$\endgroup\$
0

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.