Basically, this code does multiplication of chosen operations.
public static String[] gen8String(String[] pattern1, String[] pattern2){
String[] combinedSubset = new String[90]; //emty array for the subset of 90 strings
String combinedString = ""; //string holder for each combined string
int index = 0; //used for combinedSubset array
int present = 0; //used to check if all 6 characters are present
for(int i = 0; i < 15; i++){
for(int j = 0; j < 15; j++){
combinedString = pattern1[i] + pattern2[j]; //combine both 4 letter strings into 8 char length string
char[] parsedString = combinedString.toCharArray(); //parse into array
//check if all 6 characters are present
for(int k = 1; k <= 6; k++)
{
if(new String(parsedString).contains(k+"")) {
present++;
}
else
break;
//if all 6 are present, then add it to combined subset
if(present == 6)
combinedSubset[index++] = combinedString;
}
present = 0;
}
}
return combinedSubset;
}
Let's look at an example:
Let's says I have 2 strings ABCDEF
and ACDEBF
. Now before I call gen8string
, I will first perform a \${6 \choose 4}\$ deletion operation on each string. For instance, I will take ABCDEF
delete any two characters in that string and I am left with substring of length 4. How many different such strings can I have (given all the letters are unique)? \15ドル = {6 \choose 4}\$. Now, I will do the same for both strings. Now I am left with 2 string arrays, each of size 15.
In gen8String
, I am passing in the 2 above mentioned arrays and I am combining the substrings under the following constraints:
- I can only combine 2 strings to make one 8-length string (4+4).
- I can only combine them if there are no overlapping missing characters.
What do I mean by the 2nd point? Well, let's say I one of the inputs in pattern
is ABCD
and in pattern2
one of the inputs is ACDE
. These 2 substrings cannot be combined because there are overlapping missing characters, i.e. the F
. However, if I have ABCD
in pattern1
and CDEF
, these can be combined because all 6 characters are present in the 8-length string at least once. So, no overlapping missing characters. This can be seen by tracing through the code.
As a concluding note, this function essentially does a
$${{6}\choose{4}} * {{4}\choose{2}} = 90$$
How can I optimize/generalize this code? Ways to improve it?
1 Answer 1
Think about alternatives
Let's says I have 2 strings ABCDEF and ACDEBF. Now before I call gen8string, I will first perform a \$\binom 6 4\$ deletion operation on each string. For instance, I will take ABCDEF delete any two characters in that string and I am left with substring of length 4. How many different such strings can I have (given all the letters are unique)? \15ドル=\binom 6 4\$. Now, I will do the same for both strings. Now I am left with 2 string arrays, each of size 15.
Why?
You want to generate 90 strings. Why generate 255 strings and then mask out 165 of them? Why not just generate 90 strings directly?
ABEFCDEF 001122
ABDFCDEF 001212
ABDECDEF 001221
...
ABCFBDEF 020112
...
ABEFABCD 221100
The six numbers are which string has the letter. So 001122 means AB in the first string, CD in the second string, and EF in both. Iterate from 001122 to 221100 or vice versa. Then you don't need gen8string
at all, as all it does is eliminate invalid strings like 2222-- and 22210-.
Of course, this would happen outside the code that you posted.
Working with what we have here
You have
public static String[] gen8String(String[] pattern1, String[] pattern2){ String[] combinedSubset = new String[90]; //emty array for the subset of 90 strings String combinedString = ""; //string holder for each combined string int index = 0; //used for combinedSubset array int present = 0; //used to check if all 6 characters are present for(int i = 0; i < 15; i++){ for(int j = 0; j < 15; j++){ combinedString = pattern1[i] + pattern2[j]; //combine both 4 letter strings into 8 char length string char[] parsedString = combinedString.toCharArray(); //parse into array //check if all 6 characters are present for(int k = 1; k <= 6; k++) { if(new String(parsedString).contains(k+"")) { present++; } else break; //if all 6 are present, then add it to combined subset if(present == 6) combinedSubset[index++] = combinedString; } present = 0; } } return combinedSubset; }
Consider
public static String[] gen8String(String[] patterns1, String[] patterns2) {
String[] results = new String[90];
int resultCount = 0;
for (String pattern1 : patterns1) {
for (String pattern2 : patterns2) {
String combined = pattern1 + pattern2;
//check if all 6 characters are present
for (int k = 1; combined.contains(k+""); k++) {
//if all 6 are present, then add it to the results and end
if (k >= 6) {
results[resultCount++] = combined;
break;
}
}
}
}
return results;
}
This changes the names of the parameters. They are collections of patterns, not the patterns themselves.
I changed the names of combinedSubset
and index
to be clearer about how they relate to the results. This allowed me to drop the comments as obsolete.
Switching from C-style for
loops to range based for
loops gets rid of the unnecessary i
and j
variables. It also gets rid of the magic number 15.
I renamed combinedString
to combined
and moved the declaration to where it was set. Since a single value is never used across iterations of the outer two loops, the scope doesn't need to be any wider.
We don't need present
, as we already have k
. If we make it to the sixth iteration, we can update the results
and stop the loop.
If we gate the loop on the contains
check, we can just check the iterations once inside the loop. The original code checked twice on each iteration.
Note that your example suggested inputs like ABCD and CDEF. The patterns here are actually like 1234 and 3456.
Generalizing
Your code only works for four element subsets of a six element set. Consider rewriting so the results are a List
. Then you wouldn't have to give a result set size or maintain your own count of the results.
The largest value for k
(6 here) should probably be a parameter. Or pass in an array or list over which we could iterate.
I already modified the for
loops to be based on the size of the input parameters, so that's already generalized.
And of course the name would need to change.
Precalculate
Because the current results are so narrow, you could just set them manually. One ninety element array.
-
\$\begingroup\$ Your suggestion for not using
gen8string
and just computing the 90 in the first place seems very helpful. Can you please provide a code example? I am not able to see how I can just compute the 90 without masking out the 165. \$\endgroup\$user85591– user855912016年11月06日 20:01:54 +00:00Commented Nov 6, 2016 at 20:01
gen8String("ACFFG", "XXZ")
do, for example, and why? Also, is this Java? Please tag the question accordingly. \$\endgroup\$