Write a Kotlin function "commonPrefix"

Question 1

Task description: Write a function commonPrefix, which takes two strings as parameter and returns the common prefix.

Example: parameters abcdef and abc1gv would return abc.

My implementation:

fun commonPrefix(str1: String, str2: String): String {
 var sRet = ""
 var i = 0
 val length = if (str1.length >= str2.length)
 str1.length else str2.length
 while (i < length) {
 if (str1.take(i + 1) != str2.take(i + 1)) {
 return sRet
 }
 sRet += str1[i++]
 }
 return sRet
}
val first = "abcdef"
val sec = "abc1ef"
println(commonPrefix(first, sec))

Is my implementation done in a good way? Where could it be optimized?

Question 2

The first thing which stands out to me is that it's typically better to use standard functions than to roll-your-own even for trivial things. For example

val length = if (str1.length >= str2.length)
 str1.length else str2.length

could be better written as

val length = maxOf(str1.length, str2.length)

People often talk in circles about this thinking about efficiency, but that's generally a red herring. Modern compilers are pretty good at optimising, and the differences are likely to be minimal if they exist at all.

Instead, the number one reason for using a named function is readability. In the former example, I have to stop and piece together what it's actually doing. In the latter, I can see at a glance and continue to read the code fluently.

Following on from that, I can now see that it's the wrong check. Logically, a common prefix cannot be longer than the shorter of the two strings. So now that I can clearly see that the existing code is equivalent to a maxOf call, I realise it should probably instead be a minOf call.

The second thing that jumps out at me is the efficiencies of your loop. Every time you go through the loop, you are (1) comparing the whole prefix and (2) updating your return value. That is, the first time you're comparing "a" against "a" which is fine. By the third time round you're comparing "abc" against "abc", and so redoing the work of comparing "a" and "b". Then, you're taking your previous sRet object "ab", copying it and the "c" into a new string object, and assigning it back to sRet. If you're familiar with big O notation, that means that your overall function becomes O(n²) in the length of the prefix. It also means that you wind up creating and throwing away a lot of little temporary objects.

You don't need to do either. By the time you get to the third step in the loop, you already know that "ab" matches "ab" so you don't need to check again. Instead of using take to get the prefix, use get to get the actual letter. Then you can just check "c" against "c".

As to the problem of loads of throwaway sRet objects, the usual advice is to use a StringBuilder. Unlike a string, a StringBuilder is specifically made with the idea that you'll change it as you go. It's much more efficient for that purpose. However, in this case you don't need to do that because you don't need sRet at all! After all, both str1 and str2 contain the prefix you're after. You already have (and have used!) a function to get the prefix out. So just count along the two strings, comparing one character at a time. Once you know the number of characters that are the same, take it.

The final thing that I'd mention is testing. You've given one example pair of inputs to your function. That's a good thing, and it helps illustrate the problem. It's also worth thinking about what other examples are worth checking. In general for a function like this, I'd suggest it's worth testing at least the following:

str1	str2	reason
abc	xyz	It's easy to have a bug which messes up when the prefix is empty
a	b	It's easy to have a bug which messes up single character inputs
a	a	Same as above
abc	abc	It's easy to have a bug which breaks strings that are identical
abxyz	abc	You're trying to support inputs of different lengths, so this should be tested.
abc	abxyz	Same as above
abcde	abc	One string being a prefix of the other
Empty strings cause no end of problems and need to be tested
abc	Same as above

Josiah Josiah 6,1062 gold badges18 silver badges39 bronze badges · Accepted Answer · 2021-03-07 17:59:49Z

The first thing which stands out to me is that it's typically better to use standard functions than to roll-your-own even for trivial things. For example

val length = if (str1.length >= str2.length)
 str1.length else str2.length

could be better written as

val length = maxOf(str1.length, str2.length)

People often talk in circles about this thinking about efficiency, but that's generally a red herring. Modern compilers are pretty good at optimising, and the differences are likely to be minimal if they exist at all.

Instead, the number one reason for using a named function is readability. In the former example, I have to stop and piece together what it's actually doing. In the latter, I can see at a glance and continue to read the code fluently.

Following on from that, I can now see that it's the wrong check. Logically, a common prefix cannot be longer than the shorter of the two strings. So now that I can clearly see that the existing code is equivalent to a maxOf call, I realise it should probably instead be a minOf call.

The second thing that jumps out at me is the efficiencies of your loop. Every time you go through the loop, you are (1) comparing the whole prefix and (2) updating your return value. That is, the first time you're comparing "a" against "a" which is fine. By the third time round you're comparing "abc" against "abc", and so redoing the work of comparing "a" and "b". Then, you're taking your previous sRet object "ab", copying it and the "c" into a new string object, and assigning it back to sRet. If you're familiar with big O notation, that means that your overall function becomes O(n²) in the length of the prefix. It also means that you wind up creating and throwing away a lot of little temporary objects.

You don't need to do either. By the time you get to the third step in the loop, you already know that "ab" matches "ab" so you don't need to check again. Instead of using take to get the prefix, use get to get the actual letter. Then you can just check "c" against "c".

As to the problem of loads of throwaway sRet objects, the usual advice is to use a StringBuilder. Unlike a string, a StringBuilder is specifically made with the idea that you'll change it as you go. It's much more efficient for that purpose. However, in this case you don't need to do that because you don't need sRet at all! After all, both str1 and str2 contain the prefix you're after. You already have (and have used!) a function to get the prefix out. So just count along the two strings, comparing one character at a time. Once you know the number of characters that are the same, take it.

The final thing that I'd mention is testing. You've given one example pair of inputs to your function. That's a good thing, and it helps illustrate the problem. It's also worth thinking about what other examples are worth checking. In general for a function like this, I'd suggest it's worth testing at least the following:

str1	str2	reason
abc	xyz	It's easy to have a bug which messes up when the prefix is empty
a	b	It's easy to have a bug which messes up single character inputs
a	a	Same as above
abc	abc	It's easy to have a bug which breaks strings that are identical
abxyz	abc	You're trying to support inputs of different lengths, so this should be tested.
abc	abxyz	Same as above
abcde	abc	One string being a prefix of the other
Empty strings cause no end of problems and need to be tested
abc	Same as above

Stack Exchange Network

Write a Kotlin function "commonPrefix"

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Write a Kotlin function "commonPrefix"

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions