Task description:
Write a function commonPrefix
, which takes two strings as parameter and returns the common prefix.
Example: parameters abcdef
and abc1gv
would return abc
.
My implementation:
fun commonPrefix(str1: String, str2: String): String {
var sRet = ""
var i = 0
val length = if (str1.length >= str2.length)
str1.length else str2.length
while (i < length) {
if (str1.take(i + 1) != str2.take(i + 1)) {
return sRet
}
sRet += str1[i++]
}
return sRet
}
val first = "abcdef"
val sec = "abc1ef"
println(commonPrefix(first, sec))
Is my implementation done in a good way? Where could it be optimized?
1 Answer 1
The first thing which stands out to me is that it's typically better to use standard functions than to roll-your-own even for trivial things. For example
val length = if (str1.length >= str2.length)
str1.length else str2.length
could be better written as
val length = maxOf(str1.length, str2.length)
People often talk in circles about this thinking about efficiency, but that's generally a red herring. Modern compilers are pretty good at optimising, and the differences are likely to be minimal if they exist at all.
Instead, the number one reason for using a named function is readability. In the former example, I have to stop and piece together what it's actually doing. In the latter, I can see at a glance and continue to read the code fluently.
Following on from that, I can now see that it's the wrong check. Logically, a common prefix cannot be longer than the shorter of the two strings. So now that I can clearly see that the existing code is equivalent to a maxOf
call, I realise it should probably instead be a minOf
call.
The second thing that jumps out at me is the efficiencies of your loop. Every time you go through the loop, you are (1) comparing the whole prefix and (2) updating your return value. That is, the first time you're comparing "a" against "a" which is fine. By the third time round you're comparing "abc" against "abc", and so redoing the work of comparing "a" and "b". Then, you're taking your previous sRet
object "ab", copying it and the "c" into a new string object, and assigning it back to sRet
. If you're familiar with big O notation, that means that your overall function becomes O(n2) in the length of the prefix. It also means that you wind up creating and throwing away a lot of little temporary objects.
You don't need to do either. By the time you get to the third step in the loop, you already know that "ab" matches "ab" so you don't need to check again. Instead of using take
to get the prefix, use get
to get the actual letter. Then you can just check "c" against "c".
As to the problem of loads of throwaway sRet
objects, the usual advice is to use a StringBuilder. Unlike a string, a StringBuilder is specifically made with the idea that you'll change it as you go. It's much more efficient for that purpose. However, in this case you don't need to do that because you don't need sRet
at all! After all, both str1
and str2
contain the prefix you're after. You already have (and have used!) a function to get the prefix out. So just count along the two strings, comparing one character at a time. Once you know the number of characters that are the same, take
it.
The final thing that I'd mention is testing. You've given one example pair of inputs to your function. That's a good thing, and it helps illustrate the problem. It's also worth thinking about what other examples are worth checking. In general for a function like this, I'd suggest it's worth testing at least the following:
str1 | str2 | reason |
---|---|---|
abc | xyz | It's easy to have a bug which messes up when the prefix is empty |
a | b | It's easy to have a bug which messes up single character inputs |
a | a | Same as above |
abc | abc | It's easy to have a bug which breaks strings that are identical |
abxyz | abc | You're trying to support inputs of different lengths, so this should be tested. |
abc | abxyz | Same as above |
abcde | abc | One string being a prefix of the other |
Empty strings cause no end of problems and need to be tested | ||
abc | Same as above |