Finding Pattern score

Question 1

You are given 3 strings: text, pre_text and post_text. Let L be a substring of text. For each substring L of text, we define pattern_score as follows:

pre_text_pattern_score = highest n, such that first n characters of L are equal to the last n characters of pre_text and occur in the same exact order.

post_text_pattern_score = highest n such that last n characters of L are equal to the first n characters of post_text and occur in the same exact order.

pattern_score = pre_text_pattern_score + post_text_pattern_score. For example, if L = "nothing", pre_text = "bruno", and post_text = "ingenious", then

pre_text_pattern_score of L is 2 because the substring "no" is matched, and

post_text_pattern_score is 3 because the substring "ing" is matched.

pattern_score is 5 = 2 + 3 Your program should find a non-empty substring of text that maximizes pattern_score.

If there is a tie, return the substring with the maximal pre_text_pattern_score.

If multiple answers still have a tied score, return the answer that comes first lexicographically. Complete the definition of function string calculateScore(string text, string prefix,string suffix)

Constraints:

text, pre_text, and post_text contain only lowercase letters ('a' - 'z')

1 <= |text| <= 50

1 <= |pre-text| <= 50

1 <= |post-text| <= 50 (where |S| denotes the number of characters in string S)

Sample case #1 text: "nothing" prefix: "bruno" suffix: "ingenious" Returns: "noing"

Sample case #2 text: "ab" prefix: "b" suffix: "a" Returns: "a"

Could you please review this implementation and let me know is there any improvement I can do in this implementation?

#include<iostream>
#include<string>
#include<vector>
#include<regex>
void tokenize(std::string & text, std::vector<std::string>& text_prefix_tokens, std::vector<std::string>& text_suffix_tokens )
{
 std::string text_copy(text);
 int length = text_copy.length();
 int i = 1;
 while( i <= length ) 
 {
 text_prefix_tokens.push_back(text_copy.substr(0, i));
 ++i;
 } 
 while( length-- ) 
 {
 text_suffix_tokens.push_back(text_copy.substr(length)); 
 } 
}
std::string findHighestPattern(std::string & pre_text, std::string & post_text, std::string & text)
{
 std::vector<std::string> text_prefix_tokens;
 std::vector<std::string> text_suffix_tokens;
 tokenize(text, text_prefix_tokens, text_suffix_tokens);
 std::smatch matches;
 int post_text_score = 0;
 std::string post_result;
 for( auto &x : text_suffix_tokens)
 {
 std::regex pattern (x+"(.*)"); 
 std::regex_match(post_text,matches,pattern);
 if(matches.size() > 0)
 { 
 int post_length = x.length();
 if(post_length > post_text_score)
 {
 post_text_score = post_length;
 post_result = x;
 }
 }
 } 
 int pre_text_score = 0;
 std::string pre_result;
 for( auto &x : text_prefix_tokens)
 {
 std::regex pattern ("(.*)"+x);
 std::regex_match(pre_text, matches, pattern);
 if(matches.size() > 0)
 {
 int pre_length = x.length();
 if(pre_length > pre_text_score)
 {
 pre_text_score = x.length();
 pre_result = x;
 }
 }
 }
 std::string result = pre_result + post_result;
 if(result.length() == 0)
 { 
 text_suffix_tokens.insert(std::end(text_suffix_tokens), std::begin(text_prefix_tokens), std::end(text_prefix_tokens));
 std::sort(std::begin(text_suffix_tokens),std::end(text_suffix_tokens));
 result = text_suffix_tokens.at(0);
 } 
 return result;
}
int main()
{ 
 std::string pre_text ("bruno");
 std::string text("nothing");
 std::string post_text("ingenious");
 //std::string pre_text ("b");
 //std::string text("ab");
 //std::string post_text("a");
 std::cout<< findHighestPattern(pre_text,post_text,text) <<"\n";
}

Question 2

There are at least a few places where you seem to be doing unnecessary extra work:

First, even if not related to the sentence above, if you don't intend to modify a variable, pass it by const reference, not by simple reference.
The most obvious unnecessary work is text_copy, even the name is a hint:
```
void tokenize(std::string & text, std::vector<std::string>& text_prefix_tokens, std::vector<std::string>& text_suffix_tokens )
{
 std::string text_copy(text);
 // do stuff with text_copy...
}
```
If you only need the copy of a parameter, then don't pass it by reference, not by const reference before copying it, simply pass it by copy:
```
void tokenize(std::string text, std::vector<std::string>& text_prefix_tokens, std::vector<std::string>& text_suffix_tokens )
{
 // do stuff with text...
}
```
That way, if someones passes a temporary std::string, then the move constructor of std::string will be called instead of its copy constructor. If you use a reference then make a copy, then the copy constructor will always be called, never the move constructor.

Now, let's have a look at this piece of code:

if(result.length() == 0)
{ 
 text_suffix_tokens.insert(std::end(text_suffix_tokens), std::begin(text_prefix_tokens), std::end(text_prefix_tokens));
 std::sort(std::begin(text_suffix_tokens),std::end(text_suffix_tokens));
 result = text_suffix_tokens.at(0);
}

First, I noticed that you sort text_suffix_tokens then take its first element. Since you never use text_suffix_tokens after that, it means that you did that to find the min element of text_suffix_tokens. If it is the case, then just do so:

if(result.length() == 0)
{ 
 text_suffix_tokens.insert(std::end(text_suffix_tokens), std::begin(text_prefix_tokens), std::end(text_prefix_tokens));
 result = *std::min_element(std::begin(text_suffix_tokens),std::end(text_suffix_tokens));
}

Then I noticed that you inserted all of text_prefix_tokens at the end of text_suffix_tokens. So in other words, what you actually want to do is find the min element of text_prefix_tokens and text_suffix_tokens concatenated. However, concatenating is a bit expensive for what you're trying to do. Therefore, you could simply write this instead:

if(result.length() == 0)
{ 
 result = std::min(
 *std::min_element(std::begin(text_prefix_tokens),std::end(text_prefix_tokens)),
 *std::min_element(std::begin(text_suffix_tokens),std::end(text_suffix_tokens))
 );
}

Also, I have a remark about your spacing: it is too inconsistent. For example you sometimes put spaces after a comma, sometimes not; other times you put some when opening a brace but don't put any when clising it, etc... Your code would definitely be more pleasant to read if its style was more consistent.

Question 3

Wow so algorithm turned from O(nlogn) to O(n)!!

Question 4

I will work on the spacing too

Question 5

I don't think this is a correct solution of the problem. I run your program and got "noing" as output, which is not a substring. And the sample case 1 is not correct. If the inputs are ("onothinging", "bruno", "ingenious"), it should output "nothing".

The problem asks to find a substring L in text. What your program did is to calculate the pattern_score of text, and output the prefix & suffix that contribute to the score.

Morwenn Morwenn 20.2k3 gold badges69 silver badges132 bronze badges · Accepted Answer · 2015-04-03 12:35:09Z

There are at least a few places where you seem to be doing unnecessary extra work:

First, even if not related to the sentence above, if you don't intend to modify a variable, pass it by const reference, not by simple reference.
The most obvious unnecessary work is text_copy, even the name is a hint:
```
void tokenize(std::string & text, std::vector<std::string>& text_prefix_tokens, std::vector<std::string>& text_suffix_tokens )
{
 std::string text_copy(text);
 // do stuff with text_copy...
}
```
If you only need the copy of a parameter, then don't pass it by reference, not by const reference before copying it, simply pass it by copy:
```
void tokenize(std::string text, std::vector<std::string>& text_prefix_tokens, std::vector<std::string>& text_suffix_tokens )
{
 // do stuff with text...
}
```
That way, if someones passes a temporary std::string, then the move constructor of std::string will be called instead of its copy constructor. If you use a reference then make a copy, then the copy constructor will always be called, never the move constructor.

Now, let's have a look at this piece of code:

if(result.length() == 0)
{ 
 text_suffix_tokens.insert(std::end(text_suffix_tokens), std::begin(text_prefix_tokens), std::end(text_prefix_tokens));
 std::sort(std::begin(text_suffix_tokens),std::end(text_suffix_tokens));
 result = text_suffix_tokens.at(0);
}

First, I noticed that you sort text_suffix_tokens then take its first element. Since you never use text_suffix_tokens after that, it means that you did that to find the min element of text_suffix_tokens. If it is the case, then just do so:

if(result.length() == 0)
{ 
 text_suffix_tokens.insert(std::end(text_suffix_tokens), std::begin(text_prefix_tokens), std::end(text_prefix_tokens));
 result = *std::min_element(std::begin(text_suffix_tokens),std::end(text_suffix_tokens));
}

Then I noticed that you inserted all of text_prefix_tokens at the end of text_suffix_tokens. So in other words, what you actually want to do is find the min element of text_prefix_tokens and text_suffix_tokens concatenated. However, concatenating is a bit expensive for what you're trying to do. Therefore, you could simply write this instead:

if(result.length() == 0)
{ 
 result = std::min(
 *std::min_element(std::begin(text_prefix_tokens),std::end(text_prefix_tokens)),
 *std::min_element(std::begin(text_suffix_tokens),std::end(text_suffix_tokens))
 );
}

Also, I have a remark about your spacing: it is too inconsistent. For example you sometimes put spaces after a comma, sometimes not; other times you put some when opening a brace but don't put any when clising it, etc... Your code would definitely be more pleasant to read if its style was more consistent.

\$\begingroup\$ Wow so algorithm turned from O(nlogn) to O(n)!! \$\endgroup\$

Steephen
– Steephen

2015年04月03日 12:46:43 +00:00
Commented Apr 3, 2015 at 12:46
\$\begingroup\$ I will work on the spacing too \$\endgroup\$

Steephen
– Steephen

2015年04月03日 15:12:26 +00:00
Commented Apr 3, 2015 at 15:12

Stack Exchange Network

Finding Pattern score

2 Answers 2

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Hot Network Questions

Finding Pattern score

2 Answers 2

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Related

Hot Network Questions