The Rust Programming Language Pig Latin Exercise

Question 1

This is my implementation of the Pig Latin exercise recommended in The Rust Programming Language book. I am using the unicode-segmentation crate to split the string into words while also keeping the delimiters. Any pointers on making this code more idiomatic or more efficient?

use unicode_segmentation::UnicodeSegmentation;

#[allow(overlapping_patterns)]
fn translate_word(s: &str) -> String {
  let mut it = s.chars();
  let first = it.next().unwrap();
  match first.to_ascii_lowercase() {
    'a' | 'e' | 'i' | 'o' | 'u' => format!("{}-hay", s),
    'a'..='z' => format!("{}-{}ay", it.collect::<String>(), first),
    _ => s.to_string(),
  }
}

pub fn translate(s: &str) -> String {
  s.split_word_bounds()
    .map(translate_word)
    .collect::<Vec<_>>()
    .join("")
}

The code is inside a module named pig_latin.

Answer

Be aware that Rust typically uses 4 spaces

It's fine if you consistently use 2 spaces (especially if you override it in rustfmt.toml), but just be aware that the standard is different.
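For reference, a minimal rustfmt.toml that keeps 2-space indentation might look like the sketch below. It assumes the file sits at the crate root; tab_spaces is the rustfmt option for indentation width, and it defaults to 4.

```toml
# rustfmt.toml (assumed to live next to Cargo.toml)
tab_spaces = 2
```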

Collect directly to a `String`

Instead of collecting into a Vec<String> and then copying every piece again into a new String with join, collect into a String directly (String implements FromIterator for String items):

pub fn translate(s: &str) -> String {
    s.split_word_bounds().map(translate_word).collect()
}
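To see that the two forms produce the same result, here is a small std-only sketch. The names shout, via_vec, and direct are hypothetical, and split_inclusive(' ') merely stands in for split_word_bounds so the example runs without the unicode-segmentation crate.

```rust
/// Hypothetical stand-in for `translate_word`: upper-cases one piece.
fn shout(piece: &str) -> String {
    piece.to_uppercase()
}

/// Two-step version: build a Vec<String>, then join it into a new String.
fn via_vec(s: &str) -> String {
    s.split_inclusive(' ').map(shout).collect::<Vec<_>>().join("")
}

/// Direct version: String implements FromIterator<String>,
/// so the intermediate Vec is not needed.
fn direct(s: &str) -> String {
    s.split_inclusive(' ').map(shout).collect()
}
```

Both functions return the same String for any input; the direct version simply skips the intermediate Vec.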

Use `Chars::as_str`

When you use str::chars, the specific iterator that it returns is called Chars. It has a handy method, as_str, that returns the not-yet-consumed remainder as a &str borrowed from the original string, so you don't need to allocate a new string:

'a'..='z' => format!("{}-{}ay", it.as_str(), first),
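As an isolated illustration of Chars::as_str, the sketch below splits a string into its first character and the rest without allocating. The helper name split_first is hypothetical and not part of the original code.

```rust
/// Returns the first character and the remaining text, borrowing from `s`.
/// Returns None for an empty string.
fn split_first(s: &str) -> Option<(char, &str)> {
    let mut chars = s.chars();
    let first = chars.next()?;
    // `as_str` views whatever the iterator has not consumed yet;
    // it is a slice of `s`, not a new String.
    Some((first, chars.as_str()))
}
```

Because the iterator advances by whole characters, the remainder always starts on a valid UTF-8 boundary, even when the first character is multi-byte.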

Use `Cow`

This is a bit of an advanced optimization that you don't need to do—the Book doesn't even mention it once.

Currently, you allocate a new String even when the output is identical to the input. Instead, return Cow<str>: if the first character isn't a letter, you can return Cow::Borrowed(s), which points to the existing &str. If it does start with a letter, return Cow::Owned(format!(...)), which has the same overhead as it did before. Here, I'm using .into() instead of writing Cow::Owned and Cow::Borrowed explicitly. You can do either.

use std::borrow::Cow;

fn translate_word(s: &str) -> Cow<str> {
    let mut it = s.chars();
    let first = it.next().unwrap();
    match first.to_ascii_lowercase() {
        'a' | 'e' | 'i' | 'o' | 'u' => format!("{}-hay", s).into(),
        'a'..='z' => format!("{}-{}ay", it.as_str(), first).into(),
        _ => s.into(),
    }
}
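The sketch below spells out the Cow variants explicitly and shows that the results can still be collected straight into a String. It makes two assumptions that are not in the code above: an empty input is returned as borrowed instead of panicking on unwrap, and the hypothetical helper translate_pieces takes pre-split pieces so the example runs without the unicode-segmentation crate.

```rust
use std::borrow::Cow;

fn translate_word(s: &str) -> Cow<'_, str> {
    let mut it = s.chars();
    // Assumption: treat an empty piece as "nothing to translate".
    let first = match it.next() {
        Some(c) => c,
        None => return Cow::Borrowed(s),
    };
    match first.to_ascii_lowercase() {
        // Vowel: keep the word and append "-hay" (allocates).
        'a' | 'e' | 'i' | 'o' | 'u' => Cow::Owned(format!("{}-hay", s)),
        // Other ASCII letter: move the first letter to the end (allocates).
        'a'..='z' => Cow::Owned(format!("{}-{}ay", it.as_str(), first)),
        // Anything else (spaces, punctuation): reuse the input, no allocation.
        _ => Cow::Borrowed(s),
    }
}

/// Hypothetical helper: translates already-split pieces and concatenates them.
/// String implements FromIterator<Cow<str>>, so `collect` works directly.
fn translate_pieces(pieces: &[&str]) -> String {
    pieces.iter().map(|p| translate_word(p)).collect()
}
```

Delimiters such as ", " come back as Cow::Borrowed, so only pieces that actually change cost an allocation.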

Final code

https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=78daca7b7adab4587436cacf35bca90e

lights0123 · Accepted Answer · 2020-12-28 03:51:05Z
