This is my first post. I'm trying to learn Rust and I recently finished reading the Rust Book.
For further learning I decided to try to reimplement some GNU core utilities in Rust.
This is my approach to echo.
use std::{cell::RefCell, env};

struct Config {
    backslash_escapes: RefCell<bool>,
    trailing_newline: RefCell<bool>,
}

impl Config {
    fn new() -> Self {
        Config {
            backslash_escapes: RefCell::new(false),
            trailing_newline: RefCell::new(true),
        }
    }

    fn find_flags<'a>(
        &'a self,
        input: impl Iterator<Item = String> + 'a,
    ) -> impl Iterator<Item = String> + 'a {
        let input = input.filter_map(|word| match word.as_str() {
            "-e" => {
                *self.backslash_escapes.borrow_mut() = true;
                None
            }
            "-n" => {
                *self.trailing_newline.borrow_mut() = false;
                None
            }
            "-en" => {
                *self.backslash_escapes.borrow_mut() = true;
                *self.trailing_newline.borrow_mut() = false;
                None
            }
            "-ne" => {
                *self.backslash_escapes.borrow_mut() = true;
                *self.trailing_newline.borrow_mut() = false;
                None
            }
            "-E" => {
                *self.backslash_escapes.borrow_mut() = false;
                None
            }
            _ => Some(word),
        });
        input
    }
}

fn main() {
    // println!("{}", std::env::args().skip(1).format(" ")); // simplest version?
    let config = Config::new();
    let input = env::args().skip(1);
    // Does defining content here increase overhead?
    let mut content = config.find_flags(input);
    // I don't like that I have to use replace_escapes for first_word separately here.
    // I could try to map over the whole content first and replace escaped words (without consuming the Iterator!)?
    if let Some(first_word) = &content.next() {
        if *config.backslash_escapes.borrow() {
            print!("{}", replace_escapes(first_word));
        } else {
            print!("{}", first_word);
        }
        content.for_each(|word| {
            if let Some(word) = Some(word) {
                if *config.backslash_escapes.borrow() {
                    print!(" {}", replace_escapes(&word));
                } else {
                    print!(" {}", word);
                }
            }
        });
        if *config.trailing_newline.borrow() {
            println!("");
        }
    };

    fn replace_escapes(word: &String) -> &str {
        match word.as_str() {
            "\\a" => "\x07",       // alert (BEL)
            "\\b" => "\x08",       // backspace
            "\\c" => "(STOPPPP!)", // produce no further output
            "\\e" => "\x1b",       // escape
            "\\f" => "\x0c",       // form feed
            "\\n" => "\n",         // newline
            "\\r" => "\r",         // carriage return
            "\\t" => "\t",         // horizontal tab
            "\\v" => "\x0B",       // vertical tab
            _ => word,
        }
    }
}
I tried to incorporate feedback given on other reimplementations of echo. So for my version I tried:
- to not use crates
- to not allocate the data to a vector, and to not consume the Iterator until the output
- to still implement the option flags, although this gave me a substantial headache with the borrowing rules
My questions:
- I'm sure the overhead of using a vector is minimal in this case; still, I wanted to learn about the functional programming approaches Rust has to offer with Iterator. At any point in my code, do I allocate data that is not needed? For example, by defining the content variable?
- Is there a more elegant solution to using the boolean variable backslash_escapes, which I capture in the closure of the find_flags method? If I read the book correctly, RefCell has an overhead in enforcing the ownership rules at runtime. Is this CPU overhead? Is it still worth not collecting the Iterator in a Vector?
- Am I overcomplicating things? Or is an approach like this useful when working with larger amounts of data?
I appreciate every roast, criticism, and comment!
Comments:
I like your first sentence. Congratulations for that. And welcome. – Billal BEGUERADJ, Commented Nov 23, 2023 at 10:28
1 Answer
Well, first of all, your code doesn't work the way it should. :D
Here is the output of echo for an example input:
$ echo -e 'ff\ngg\n' 'aaa\tbbb'
ff
gg
aaa bbb
And here is the output of your program:
$ ./main -e 'ff\ngg\n' 'aaa\tbbb'
ff\ngg\n aaa\tbbb
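As you can see, escape sequences inside an argument are not interpreted, so the newlines and the tab are printed literally. The reason is that replace_escapes only matches an argument that consists of exactly one escape sequence. Writing unit tests for your code can prevent such mistakes.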
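To make the point about escapes and unit tests concrete, here is a minimal sketch of an escape expander that scans each argument character by character instead of matching the whole argument. The name expand_escapes and the (String, bool) return type are my own choices, not something from the original code. It allocates a String per argument, it leaves out the octal and hex forms (\0NNN, \xHH), and keeping unknown escapes verbatim is an assumption of this sketch.

```rust
/// Expands backslash escapes anywhere inside `word` (hypothetical helper).
/// The bool is `false` when `\c` was seen, meaning "produce no further output".
fn expand_escapes(word: &str) -> (String, bool) {
    let mut out = String::with_capacity(word.len());
    let mut chars = word.chars();
    while let Some(c) = chars.next() {
        if c != '\\' {
            out.push(c);
            continue;
        }
        // We just consumed a backslash; look at the character after it.
        match chars.next() {
            Some('a') => out.push('\x07'), // alert (BEL)
            Some('b') => out.push('\x08'), // backspace
            Some('c') => return (out, false), // stop all further output
            Some('e') => out.push('\x1b'), // escape
            Some('f') => out.push('\x0c'), // form feed
            Some('n') => out.push('\n'),   // newline
            Some('r') => out.push('\r'),   // carriage return
            Some('t') => out.push('\t'),   // horizontal tab
            Some('v') => out.push('\x0b'), // vertical tab
            Some('\\') => out.push('\\'),  // literal backslash
            // Assumption: an unknown escape is kept as written.
            Some(other) => {
                out.push('\\');
                out.push(other);
            }
            // A lone backslash at the end of the argument is kept.
            None => out.push('\\'),
        }
    }
    (out, true)
}

#[test]
fn escapes_inside_a_word_are_expanded() {
    assert_eq!(expand_escapes("ff\\ngg\\n"), ("ff\ngg\n".to_string(), true));
    assert_eq!(expand_escapes("ab\\cde"), ("ab".to_string(), false));
}
```

A test like the one at the bottom would have caught the problem shown above, because the example input contains escapes in the middle of an argument.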
The iterator approach seems interesting, but I don't think there is any need to use RefCell here. It contradicts the whole point of functional programming, which is to keep state immutable.
Since flags only appear before the other arguments, I think you can check for flags first, create your config by consuming them, and then return the rest of the iterator for further processing.
Something like this:
struct Config {
    backslash_escapes: bool,
    trailing_newline: bool,
}

impl Config {
    fn new() -> Self {
        Config {
            backslash_escapes: false,
            trailing_newline: true,
        }
    }

    fn make_new_config(self, flag: &str) -> Config {
        match flag {
            "-e" => Config {
                backslash_escapes: true,
                ..self
            },
            "-n" => Config {
                trailing_newline: false,
                ..self
            },
            "-en" | "-ne" => Config {
                backslash_escapes: true,
                trailing_newline: false,
            },
            "-E" => Config {
                backslash_escapes: false,
                ..self
            },
            _ => self,
        }
    }

    fn check_flags(
        input: impl Iterator<Item = String>,
        mut config: Config,
    ) -> (Config, impl Iterator<Item = String>) {
        let mut peekable = input.peekable();
        while let Some(flag) = peekable.next_if(|arg| match arg.as_str() {
            "-E" | "-e" | "-n" | "-en" | "-ne" => true,
            _ => false,
        }) {
            config = config.make_new_config(&flag);
        }
        (config, peekable)
    }
}
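As a rough sketch of how the pieces could be wired together, the block below restates Config in compact form so that it compiles on its own, and adds a helper that builds the output as a String. The render function is hypothetical and exists only to make the behaviour testable. The escape handling in it is a deliberately tiny stand-in (only \n and \t, no escaped backslash), not a full implementation.

```rust
// Compact restatement of the Config above so this sketch is self-contained.
struct Config {
    backslash_escapes: bool,
    trailing_newline: bool,
}

impl Config {
    fn new() -> Self {
        Config { backslash_escapes: false, trailing_newline: true }
    }

    fn make_new_config(self, flag: &str) -> Config {
        match flag {
            "-e" => Config { backslash_escapes: true, ..self },
            "-n" => Config { trailing_newline: false, ..self },
            "-en" | "-ne" => Config { backslash_escapes: true, trailing_newline: false },
            "-E" => Config { backslash_escapes: false, ..self },
            _ => self,
        }
    }

    fn check_flags(
        input: impl Iterator<Item = String>,
        mut config: Config,
    ) -> (Config, impl Iterator<Item = String>) {
        let mut peekable = input.peekable();
        // Consume arguments only while they are recognised flags.
        while let Some(flag) =
            peekable.next_if(|arg| matches!(arg.as_str(), "-E" | "-e" | "-n" | "-en" | "-ne"))
        {
            config = config.make_new_config(&flag);
        }
        (config, peekable)
    }
}

/// Hypothetical helper: joins the non-flag arguments with single spaces.
fn render(args: impl Iterator<Item = String>) -> String {
    let (config, words) = Config::check_flags(args, Config::new());
    let mut out = String::new();
    for (i, word) in words.enumerate() {
        if i > 0 {
            out.push(' ');
        }
        if config.backslash_escapes {
            // Stand-in only: handles just \n and \t and ignores escaped backslashes.
            out.push_str(&word.replace("\\n", "\n").replace("\\t", "\t"));
        } else {
            out.push_str(&word);
        }
    }
    if config.trailing_newline {
        out.push('\n');
    }
    out
}
```

In main this would be called as print!("{}", render(std::env::args().skip(1))). Because the flags are consumed before the loop starts, the config is final by the time the first word is written, so no RefCell is needed, and a "-n" that appears after the first regular word is treated as text. Building a String does allocate; writing each word straight to stdout inside the loop would avoid that.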
Comments:
For me, echo 'ff\ngg\n' 'aaa\tbbb' prints this: ff\ngg\n aaa\tbbb, so there is no extra whitespace from a newline or tab. Do you maybe mean printf? Or maybe you wanted to use the -e option? echo -e 'ff\ngg\n' 'aaa\tbbb' prints ff gg aaa bbb across separate lines. – K Y, Commented Nov 27, 2023 at 18:09
Yes, I meant it with the -e option. I tested it on bash, which has a builtin echo command with -e as the default case, but for the GNU echo program -E is the default case. This post explains the difference: unix.stackexchange.com/questions/153660/… – Sajjad Reyhani, Commented Nov 28, 2023 at 13:48
And you could #[derive(clap::Parser)] for the config above, if the OP did not want to avoid using other crates. – Richard Neumann, Commented Nov 30, 2023 at 13:31