I tried solving the following problem by following the suggested approach to solving the problem in the problem tutorial. However, the online judge still says my solution exceeds the time limit. Is there a way to optimize the program to perform faster?
The Problem:
You are given \$n\$ strings \$s_1, s_2, \dots, s_n\$ of length at most 8.
For each string \$s_i\$, determine if there exist two strings \$s_j\$ and \$s_k\$ such that \$s_i = s_j + s_k\$. That is, \$s_i\$ is the concatenation of \$s_j\$ and \$s_k\$. Note that \$j\$ can be equal to \$k\$.
Recall that the concatenation of strings \$s\$ and \$t\$ is \$s + t = s_1 s_2 \dots s_p t_1 t_2 \dots p_q\$, where \$p\$ and \$q\$ are the lengths of strings \$s\$ and \$t\$ respectively. For example, concatenation of "code" and "forces" is "codeforces".
Input
The first line contains a single integer \$t\$ (\1ドル \leq t \leq 10^4\$) — the number of test cases.
The first line of each test case contains a single integer \$n\$ (\1ドル \leq n \leq 10^5\$) — the number of strings.
Then \$n\$ lines follow, the \$i\$-th of which contains non-empty string \$s_i\$ of length at most 8, consisting of lowercase English letters. Among the given \$n\$ strings, there may be duplicates.
The sum of \$n\$ over all test cases doesn't exceed \10ドル^5\$.
Output
For each test case, output a binary string of length \$n\$. The \$i\$-th bit should be 1 if there exist two strings \$s_j\$ and \$s_k\$ where \$s_i = s_j + s_k\$, and 0 otherwise. Note that \$j\$ can be equal to \$k\$.
Example
Input:
3 5 abab ab abc abacb c 3 x xx xxx 8 codeforc es codes cod forc forces e code
Output:
10100 011 10100101
The Problem Tutorial (Hint):
Use some data structure that allows you to answer queries of the form: "does the string \$t\$ appear in the array \$s_1, \dots, s_n\$?" For example, in C++ you can use a
map<string, bool>
, while in Python you can use a dictionary.Afterwards, for each string \$s\$, brute force all strings \$x\$ and \$y\$ such that \$s=x+y\$. There are at most 7 such strings, because \$s\$ has length at most 8. Then check if both \$x\$ and \$y\$ appear in the array using your data structure.
The time complexity is \$O(l n \log n)\$ per test case, where \$l\$ is the maximum length of an input string.
My Solution
#include <iostream>
#include <string>
#include <unordered_map>
using namespace std;
int main()
{
int tests;
cin >> tests;
for (int i = 0; i < tests; i++) { //one loop per each test case
int NumOfStrings;
cin >> NumOfStrings;
unordered_map<string, bool> phrases;
string words[100000];
for (int j = 0; j < NumOfStrings; j++) { // adds all input strings for each test case into words and phrases
string temp;
cin >> temp;
phrases.insert({ temp, false });
words[j] = temp;
}
for (int k = 0; k < NumOfStrings; k++) { // loop for each string in the test case
int len = words[k].length(); // len is length of current word
for (int o = 0; o < len; o++) { // loops through current word
string partition1 = "";
string partition2 = "";
for (int m = 0; m <= o; m++) {
partition1 += words[k][m];
}
for (int n = o+1; n < len; n++) {
partition2 += words[k][n];
}
if (phrases.find(partition1) != phrases.end() && phrases.find(partition2) != phrases.end()) { //if both substrings exist in the map, then value for word is 1.
phrases[words[k]] = true;
}
}
}
for (int j = 0; j < NumOfStrings; j++) {
int value = 0;
if (phrases[words[j]] == true) value = 1;
cout << value;
}
cout << endl;
}
return 0;
}
-
\$\begingroup\$ I thought code review SE was for code that already works as intended. if it doesn't it probably belongs on SO \$\endgroup\$qwr– qwr2022年08月15日 03:26:48 +00:00Commented Aug 15, 2022 at 3:26
-
\$\begingroup\$ @qwr Too inefficient (time or space) is the one failure we accept here. \$\endgroup\$Deduplicator– Deduplicator2022年08月15日 06:15:57 +00:00Commented Aug 15, 2022 at 6:15
2 Answers 2
You are failing with the datastructures.
What you need is:
A string to mark your answers.
Start with all no.A
std::vector
to collect your strings.A way to identify all candidates for the prefix fast.
A sortedstd::vector
will do.- You might want to get fancy when iterating it to identify candidates even faster.
- Don't forget to include the id to mark your answers.
A way to test existence of a matching suffix.
A hashset will do fabulously.
And for the love of all, don't duplicate strings for the two datastructures, use std::string_view
.
As you only use it once it's not that bad, but you really don't need the manual flush in std::endl
. And if you did need it, you should still be explicit and use std::flush
.
-
\$\begingroup\$ Now I'm questioning my beliefs whether
unordered_set
is almost always faster thanset
. I had a solution usingunordered_set
rejected with "time limit exceeded" verdict, but the one usingset
was accepted. I packed strings intounsigned long long
integers and used bit operations. \$\endgroup\$panik– panik2022年08月16日 10:44:03 +00:00Commented Aug 16, 2022 at 10:44
Move solution into a separate function
Doing this greatly improves code readability.
void solve() {
int NumOfStrings;
/* ... */
}
int main() {
int t;
cin >> t;
while (t--)
solve();
return 0;
}
Be careful about using resources
On each iteration, we do a lot of redundant work by creating and destroying 100k strings.
for (int i = 0; i < tests; i++) { //one loop per each test case
string words[100000];
}
Presumably, heap allocation is not involved due to the small string optimization.
To optimize memory usage, make the array static
.
for (int i = 0; i < tests; i++) { //one loop per each test case
static string words[100000];
}
Misleading hint
Before checking the author's solution, I didn't quite understand the purpose of using map
.
Let's find it out.
void solve() {
/* ... */
map<string, bool> mp;
for (int i = 0; i < n; i++) {
cin >> s[i];
mp[s[i]] = true; // map input string to true
}
for (int i = 0; i < n; i++) {
/* ... */
/* mp[pref] adds pair {pref, false} to mp
if pref is not presented and returns false;
otherwise, returns true if pref is an input string */
if (mp[pref] && mp[suff]) { ok = true; }
/* ... */
}
The author's map
gradually growths, but only input strings are mapped to true
.
Looks like the hint was misinterpreted, which is not surprising because it had been taken out of the context.
for (int k = 0; k < NumOfStrings; k++) { // loop for each string in the test case
/* ... */
if (phrases.find(partition1) != phrases.end() && phrases.find(partition2) != phrases.end()) { //if both substrings exist in the map, then value for word is 1.
phrases[words[k]] = true;
}
}
}
for (int j = 0; j < NumOfStrings; j++) {
int value = 0;
if (phrases[words[j]] == true) value = 1;
cout << value;
}
Needlessly changing mapped values to check them later in the separate loop, is a lot of redundant work.
Searching for partitions is the only crucial part here, which may be performed with a set instead of map
.
In conclusion
It suffices to apply the proposed improvements to make your solution accepted.
#include <iostream>
#include <string>
#include <unordered_set>
using namespace std;
void solve()
{
int NumOfStrings;
cin >> NumOfStrings;
unordered_set<string> phrases;
static string words[100000];
for (int j = 0; j < NumOfStrings; j++) { // adds all input strings for each test case into words and phrases
string temp;
cin >> temp;
phrases.insert(temp);
words[j] = temp;
}
for (int k = 0; k < NumOfStrings; k++) { // loop for each string in the test case
int len = words[k].length(); // len is length of current word
bool ok = false;
for (int o = 0; o < len; o++) { // loops through current word
string partition1 = "";
string partition2 = "";
for (int m = 0; m <= o; m++) {
partition1 += words[k][m];
}
for (int n = o+1; n < len; n++) {
partition2 += words[k][n];
}
if (phrases.find(partition1) != phrases.end() && phrases.find(partition2) != phrases.end()) { //if both substrings exist in the map, then value for word is 1.
ok = true;
break;
}
}
cout << ok;
}
cout << endl;
}
int main()
{
int tests;
cin >> tests;
while (tests--)
solve();
return 0;
}
Explore related questions
See similar questions with these tags.