As an exercise I repeated this Java question, but in Go: Convert string to mixed case
The objective is for every second letter to be converted to uppercase.
Go string processing is relatively new to me, and I am looking for feedback on my use of the unicode package, any other Go language or library features I should be using, and of course, issues with style, convention, or possible bugs.
I have put the following in the playground as well.
package kata

import (
	"unicode"
)

// AlternateCase will return the input string modified such that alternate letters are transformed to uppercase.
//
// Note that non-letters are ignored, so in the input string `a!b` the `!` is ignored, so `b` is the second letter.
// The result of AlternateCase on `a!b` is `A!b`.
func AlternateCase(input string) string {
	runes := make([]rune, 0, len(input))
	var upper bool
	for _, c := range input {
		if unicode.IsLetter(c) {
			upper = !upper
			if upper {
				c = unicode.ToUpper(c)
			}
		}
		runes = append(runes, c)
	}
	return string(runes)
}
I have also written some test cases, and a documentation example:
package kata

import (
	"fmt"
	"testing"
)

func TestAlternateCase(t *testing.T) {
	cases := []struct{ input, output string }{
		{"hello, world!", "HeLlO, wOrLd!"},
		{"a!b", "A!b"},
		{"AAAA", "AAAA"},
		{"", ""},
		{"h", "H"},
		{"!h", "!H"},
		{"日本語", "日本語"},
		{"f日u本b語ar", "F日U本B語Ar"},
	}
	for _, c := range cases {
		got := AlternateCase(c.input)
		if got != c.output {
			t.Errorf("For input '%v' expect '%v' but got '%v'", c.input, c.output, got)
		}
	}
}

func ExampleAlternateCase() {
	hi := "hello, world!"
	fmt.Printf("The AlternateCase of '%v' is '%v'\n", hi, AlternateCase(hi))
	// Output: The AlternateCase of 'hello, world!' is 'HeLlO, wOrLd!'
}
I am looking for feedback on the test mechanisms as well.
1 Answer
I think both your solution and your testing are fine.
A couple of test cases I would add (but your solution also passes them):
- a string starting with a BOM, e.g. {"\xef\xbb\xbfhi", "\xef\xbb\xbfHi"}
- a string which is not valid UTF-8, e.g. {"h\xffi", "H\xef\xbf\xbdi"}
- a string with multiple invalid UTF-8 bytes, e.g. {"h\xff\xffi", "H\xef\xbf\xbd\xef\xbf\xbdi"}
Explaining the invalid UTF-8 strings: since you are converting your input string to runes, each invalid UTF-8 byte is reported as the value 0xfffd (the Unicode replacement character). When that rune is encoded as UTF-8 again (which happens when you convert the []rune back to a string), it results in the sequence []byte{0xef, 0xbf, 0xbd}; this is what the invalid 0xff byte is compared to in the test case.
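The replacement behavior described above can be observed directly. The following is a minimal sketch; `RoundTrip` is a hypothetical helper introduced only for this illustration and is not part of the reviewed code.

```go
package main

import "fmt"

// RoundTrip converts s to []rune and back to string, which is what any
// rune-based transformation does implicitly.
// (Hypothetical helper name, used only for this illustration.)
func RoundTrip(s string) string {
	return string([]rune(s))
}

func main() {
	in := "h\xffi" // 0xff can never appear in valid UTF-8

	// Ranging over a string decodes runes; the invalid byte is
	// reported as U+FFFD.
	for i, r := range in {
		fmt.Printf("byte offset %d: %U\n", i, r)
	}

	// After the round trip, the single 0xff byte has become the
	// three-byte UTF-8 encoding of U+FFFD.
	fmt.Printf("% x\n", RoundTrip(in)) // 68 ef bf bd 69
}
```

This also means the function is not byte-preserving for invalid input: the output can be longer than the input.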
Performance-wise, I would use a direct string => []rune conversion, since you not only need the runes but also want to convert them back to a string. Using append() is slower, as it has to assign a slice value (a descriptor) at each rune.
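The performance claim is best checked by measuring rather than taken on faith. Below is a rough sketch of how the two approaches could be compared using `testing.Benchmark` from a plain program. The names `AlternateAppend` and `AlternateInPlace` and the sample input are assumptions made for this illustration; no timings are quoted here because they depend on the machine and Go version.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
	"unicode"
)

// AlternateAppend mirrors the question's approach: build a new []rune with append.
func AlternateAppend(input string) string {
	runes := make([]rune, 0, len(input))
	var upper bool
	for _, c := range input {
		if unicode.IsLetter(c) {
			upper = !upper
			if upper {
				c = unicode.ToUpper(c)
			}
		}
		runes = append(runes, c)
	}
	return string(runes)
}

// AlternateInPlace converts once to []rune and overwrites only the changed runes.
func AlternateInPlace(s string) string {
	rs, upper := []rune(s), false
	for i, r := range rs {
		if unicode.IsLetter(r) {
			if upper = !upper; upper {
				rs[i] = unicode.ToUpper(r)
			}
		}
	}
	return string(rs)
}

// sink keeps the results alive so the calls are not optimized away.
var sink string

func main() {
	// Arbitrary sample input, chosen only for illustration.
	input := strings.Repeat("hello, world! f日u本b語ar ", 100)

	appendRes := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			sink = AlternateAppend(input)
		}
	})
	inPlaceRes := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			sink = AlternateInPlace(input)
		}
	})

	fmt.Println("append:  ", appendRes)
	fmt.Println("in place:", inPlaceRes)
}
```

In a real package the same comparison would more naturally live in `BenchmarkXxx` functions run with `go test -bench`.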
Also note that if we convert the string to []rune, we only need to assign (overwrite) the runes which are converted to upper case.
This is how I would do it:
func AlternateCase(s string) string {
	rs, upper := []rune(s), false
	for i, r := range rs {
		if unicode.IsLetter(r) {
			if upper = !upper; upper {
				rs[i] = unicode.ToUpper(r)
			}
		}
	}
	return string(rs)
}
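As a quick sanity check, the in-place version can be run as a standalone program against a few of the question's cases together with the BOM and invalid UTF-8 cases suggested earlier. This is only a sketch: the `main` wrapper and the printed format are assumptions for illustration, and in the real package these cases would simply be added to the existing test table.

```go
package main

import (
	"fmt"
	"unicode"
)

// AlternateCase uppercases every other letter, skipping non-letters,
// by overwriting runes in a single []rune conversion of the input.
func AlternateCase(s string) string {
	rs, upper := []rune(s), false
	for i, r := range rs {
		if unicode.IsLetter(r) {
			if upper = !upper; upper {
				rs[i] = unicode.ToUpper(r)
			}
		}
	}
	return string(rs)
}

func main() {
	cases := []struct{ input, output string }{
		{"hello, world!", "HeLlO, wOrLd!"},
		{"a!b", "A!b"},
		{"f日u本b語ar", "F日U本B語Ar"},
		{"\xef\xbb\xbfhi", "\xef\xbb\xbfHi"},         // leading BOM
		{"h\xffi", "H\xef\xbf\xbdi"},                 // one invalid byte
		{"h\xff\xffi", "H\xef\xbf\xbd\xef\xbf\xbdi"}, // two invalid bytes
	}
	for _, c := range cases {
		got := AlternateCase(c.input)
		// %q escapes non-printable and invalid bytes so they stay readable.
		fmt.Printf("%q -> %q (ok: %t)\n", c.input, got, got == c.output)
	}
}
```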
U+65E5 is part of the Letter class (see: fileformat.info/info/unicode/char/65e5/index.htm); Character.IsLetter() is "yes".
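The same holds for Go's `unicode.IsLetter`, which is why the test case `f日u本b語ar` yields `F日U本B語Ar`: letters without case still count as letters and flip the alternation, even though `unicode.ToUpper` leaves them unchanged. A minimal sketch follows; `Describe` is a hypothetical helper used only here.

```go
package main

import (
	"fmt"
	"unicode"
)

// Describe reports whether r is a letter and whether ToUpper would change it.
// (Hypothetical helper name, used only for this illustration.)
func Describe(r rune) (isLetter, changed bool) {
	return unicode.IsLetter(r), unicode.ToUpper(r) != r
}

func main() {
	for _, r := range "a日!" {
		isLetter, changed := Describe(r)
		fmt.Printf("%c (%U): letter=%t, changed by ToUpper=%t\n", r, r, isLetter, changed)
	}
}
```

If only cased letters should take part in the alternation, the letter check would have to be narrowed, for example by testing `unicode.IsUpper(r) || unicode.IsLower(r)` instead; whether that is desirable depends on how the exercise defines "letter".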