1
$\begingroup$

The naive algorithm for generating all unique substrings involves traversing all the string from 1..n, thus takes $\mathcal{O}(n^2)$, but it also involves maintaining a list for all generated substring to check everytime for repetitions.

So the total time can't be $\mathcal{O}(n^2)$, since there is an added checking of uniqueness, which means that I have to maintain a list of all previous generated substrings.

So how is it correct that when it is said that the algorithm takes only $\mathcal{O}(n^2)$ time.

There is obviously an added cost at each substring generation.

The actual time complexity should be as pointed out by Raphael should be something like the following -:

$\mathcal{O}(n^2) + (\mathcal{O}(m^2) * \mathcal{O}(n)))$

where $m$ is the total number of substrings generated $n(n+1)/2$, and the time to compare 2 different strings is the smaller length between two strings, assuming the strings are being compared character by character.

Now if the above is correct then what I am confused about is should it be plus or a multiply with the comparision cost.

asked Sep 19, 2017 at 9:48
$\endgroup$
1

1 Answer 1

3
$\begingroup$

To be blunt, the Stack Overflow question and its answers illustrate why you want to ask about such things here.

  1. The propsed algorithm clearly does not run in quadratic time.
  2. Since the length $n$ of the original string is our input size, and the length of the longest possible result string, we can not assume that string comparisons run in constant time.
  3. The bound you give is therefore also wrong, since you can not check in constant time if any string is in a list of size $L$.

Instead, the running time of the algorithm is more complicated and depends on several factors (new variable names).

  • The length $N$ of the input string.
  • The number $m$ of output strings. $m \in \Omega(n) \cap O(n^2)$.
  • The total length $M$ of all output strings. $M \in \Omega(n^2) \cap O(n^3)$.

You should be able to derive rough bounds using standard approaches. Keep in mind to fix the dictionary implementation you use for the "list"! Using, say, tries you get lookup (and insertion) costs bounded by the length of the string at hand.

answered Sep 19, 2017 at 11:38
$\endgroup$
0

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.