Alternate letters to UpperCase

Question 1

As an exercise I repeated this Java question, but in Go: Convert string to mixed case

The objective is to convert every other letter to uppercase, starting with the first letter.

Go string processing is relatively new to me, and I am looking for feedback on my use of the unicode package, any other Go language or library features I should be using, and, of course, issues with style, convention, or possible bugs.

I have put the following in the playground as well.

package kata

import (
	"unicode"
)

// AlternateCase will return the input string modified such that alternate letters are transformed to uppercase.
//
// Note that non-letters are ignored, so in the input string `a!b` the `!` is ignored, so `b` is the second letter.
// The result of AlternateCase on `a!b` is `A!b`.
func AlternateCase(input string) string {
	runes := make([]rune, 0, len(input))
	var upper bool
	for _, c := range input {
		if unicode.IsLetter(c) {
			upper = !upper
			if upper {
				c = unicode.ToUpper(c)
			}
		}
		runes = append(runes, c)
	}
	return string(runes)
}

I have also written some test cases, and a documentation example:

package kata

import (
	"fmt"
	"testing"
)

func TestAlternateCase(t *testing.T) {
	cases := []struct{ input, output string }{
		{"hello, world!", "HeLlO, wOrLd!"},
		{"a!b", "A!b"},
		{"AAAA", "AAAA"},
		{"", ""},
		{"h", "H"},
		{"!h", "!H"},
		{"日本語", "日本語"},
		{"f日u本b語ar", "F日U本B語Ar"},
	}
	for _, c := range cases {
		got := AlternateCase(c.input)
		if got != c.output {
			t.Errorf("For input '%v' expect '%v' but got '%v'", c.input, c.output, got)
		}
	}
}

func ExampleAlternateCase() {
	hi := "hello, world!"
	fmt.Printf("The AlternateCase of '%v' is '%v'\n", hi, AlternateCase(hi))
	// Output: The AlternateCase of 'hello, world!' is 'HeLlO, wOrLd!'
}

I am looking for feedback on the test mechanisms as well.

Question 2

Could you explain the last two test examples that involve Chinese characters? Are the Chinese characters being included in the case alternation? I am especially wondering about the one with fubar -> FUBAr.

Question 3

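Example taken from the lower part of the page: blog.golang.org/strings. Note that Unicode U+65E5 (日) is part of the Letter class (see: fileformat.info/info/unicode/char/65e5/index.htm ); Character.isLetter() is "yes". So these characters do count as letters in the alternation; they simply have no uppercase form, so converting them to upper case leaves them unchanged.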
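To make the behaviour of the mixed test case concrete, here is a minimal sketch that uses only the standard unicode package. The `describe` helper is a hypothetical name introduced for illustration; it reports whether a rune is a letter and whether unicode.ToUpper changes it.

```go
package main

import (
	"fmt"
	"unicode"
)

// describe (hypothetical helper) reports whether r is a letter and
// whether unicode.ToUpper maps it to a different rune.
func describe(r rune) (isLetter, changed bool) {
	return unicode.IsLetter(r), unicode.ToUpper(r) != r
}

func main() {
	// Walk the runes of one of the question's test inputs.
	for _, r := range "f日u本b語ar" {
		isLetter, changed := describe(r)
		fmt.Printf("%c letter=%t changedByToUpper=%t\n", r, isLetter, changed)
	}
}
```

Every rune in that input is a letter, so each one flips the alternation flag, but only the Latin letters have an uppercase form. That is why the result is F日U本B語Ar: the odd positions hold the Latin letters f, u, b and a, while r lands on an even position and stays lowercase.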

Answer

I think both your solution and your testing are fine.

A couple of test cases I would add (but your solution also passes them):

a string starting with BOM, e.g. {"\xef\xbb\xbfhi", "\xef\xbb\xbfHi"}
a string which is not a valid UTF-8 string, e.g. {"h\xffi", "H\xef\xbf\xbdi"}
a string with multiple invalid UTF-8 bytes, e.g. {"h\xff\xffi", "H\xef\xbf\xbd\xef\xbf\xbdi"}
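The three cases above can be checked with a small standalone program. This is only a sketch: it repeats the function from the question so that it compiles on its own, and the names `extraCases` and `failures` are hypothetical. In the real package these entries would simply be appended to the existing table in TestAlternateCase.

```go
package main

import (
	"fmt"
	"unicode"
)

// AlternateCase is the function from the question, repeated here so the
// sketch is self-contained.
func AlternateCase(input string) string {
	runes := make([]rune, 0, len(input))
	var upper bool
	for _, c := range input {
		if unicode.IsLetter(c) {
			upper = !upper
			if upper {
				c = unicode.ToUpper(c)
			}
		}
		runes = append(runes, c)
	}
	return string(runes)
}

// extraCases (hypothetical name) holds the suggested additional inputs.
var extraCases = []struct{ input, output string }{
	{"\xef\xbb\xbfhi", "\xef\xbb\xbfHi"},         // leading BOM
	{"h\xffi", "H\xef\xbf\xbdi"},                 // one invalid byte
	{"h\xff\xffi", "H\xef\xbf\xbd\xef\xbf\xbdi"}, // two invalid bytes
}

// failures returns one message per case whose result differs from the
// expected output.
func failures() []string {
	var out []string
	for _, c := range extraCases {
		if got := AlternateCase(c.input); got != c.output {
			out = append(out, fmt.Sprintf("input %q: expected %q, got %q", c.input, c.output, got))
		}
	}
	return out
}

func main() {
	fmt.Println("failures:", len(failures()))
}
```

Using %q in the messages keeps the invalid bytes visible as escape sequences instead of printing them raw.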

Explaining the invalid UTF-8 strings: since you are converting your input string to runes, each invalid UTF-8 byte is reported as the rune 0xFFFD (the Unicode replacement character). When that rune is encoded back to UTF-8 (which happens when you convert []rune back to string), it becomes the byte sequence []byte{0xef, 0xbf, 0xbd}; this is what the invalid 0xff byte turns into in the expected output of the test case.
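The round trip described above can be observed directly. A minimal sketch follows; `roundTrip` is a hypothetical helper name, and only the built-in string and []rune conversions are used.

```go
package main

import "fmt"

// roundTrip (hypothetical helper) decodes s into runes and encodes the
// runes back into a string, as the rune-based solutions do.
func roundTrip(s string) ([]rune, string) {
	rs := []rune(s)
	return rs, string(rs)
}

func main() {
	rs, out := roundTrip("h\xffi")
	fmt.Printf("%U\n", rs)   // [U+0068 U+FFFD U+0069]
	fmt.Printf("% x\n", out) // 68 ef bf bd 69
}
```

The single invalid byte 0xff goes in, and the three-byte encoding of U+FFFD comes out, so the output string is two bytes longer than the input.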

Performance-wise, I would use a string => []rune conversion, since you not only need the runes but also want to convert them back to a string. Using append() is slower, as it has to assign a slice value (a descriptor) for each rune.

Also note that if we convert the string to []rune, we only need to assign (overwrite) the runes that are actually converted to upper case.

This is how I would do it:

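// AlternateCase converts s to a rune slice once and overwrites, in place,
// only the runes that need to be upper-cased.
func AlternateCase(s string) string {
	rs, upper := []rune(s), false
	for i, r := range rs {
		if unicode.IsLetter(r) {
			// Flip the flag on every letter; upper-case when it is set.
			if upper = !upper; upper {
				rs[i] = unicode.ToUpper(r)
			}
		}
	}
	return string(rs)
}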
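Rather than taking the performance difference on faith, it can be measured. The following is a rough sketch, not a rigorous benchmark: it places both versions side by side under the hypothetical names `alternateAppend` and `alternateInPlace`, first checks that they agree, and then times them with testing.Benchmark from the standard library. The input string and its size are arbitrary assumptions, and the numbers will vary by machine and Go version.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
	"unicode"
)

// alternateAppend is the question's append-based version (renamed here).
func alternateAppend(input string) string {
	runes := make([]rune, 0, len(input))
	var upper bool
	for _, c := range input {
		if unicode.IsLetter(c) {
			upper = !upper
			if upper {
				c = unicode.ToUpper(c)
			}
		}
		runes = append(runes, c)
	}
	return string(runes)
}

// alternateInPlace is the answer's in-place version (renamed here).
func alternateInPlace(s string) string {
	rs, upper := []rune(s), false
	for i, r := range rs {
		if unicode.IsLetter(r) {
			if upper = !upper; upper {
				rs[i] = unicode.ToUpper(r)
			}
		}
	}
	return string(rs)
}

func main() {
	// Arbitrary sample input mixing ASCII, punctuation and CJK letters.
	input := strings.Repeat("hello, 日本語 world! ", 1000)
	if alternateAppend(input) != alternateInPlace(input) {
		panic("implementations disagree")
	}
	impls := []struct {
		name string
		f    func(string) string
	}{
		{"append", alternateAppend},
		{"in-place", alternateInPlace},
	}
	for _, impl := range impls {
		res := testing.Benchmark(func(b *testing.B) {
			for i := 0; i < b.N; i++ {
				impl.f(input)
			}
		})
		fmt.Println(impl.name, res)
	}
}
```

In a real package the same comparison would more commonly be written as BenchmarkXxx functions in a _test.go file and run with go test -bench.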

icza · Accepted Answer · 2016-05-03 08:44:19Z
