I'm writing code to read data from a text file in Rust. The target files have two columns and an arbitrary number of rows, for example:
example.txt
1 1
2 4
3 9
4 16
To achieve this, I wrote the following code.
use std::fs::read_to_string;
pub fn open_file(filename: &str) -> (Vec<f32>, Vec<f32>) {
    let data = read_to_string(filename);
    let xy = match data {
        Ok(content) => content,
        Err(error) => {
            panic!("Could not open or find file: {}", error);
        }
    };
    let xy_pairs: Vec<&str> = xy.trim().split("\n").collect();
    let mut x: Vec<f32> = Vec::new();
    let mut y: Vec<f32> = Vec::new();
    for pair in xy_pairs {
        let p: Vec<&str> = pair.trim().split(" ").collect();
        x.push(p[0].parse().unwrap());
        y.push(p[1].parse().unwrap());
    }
    (x, y)
}
I'm a beginner in Rust, so I don't know whether this follows Rust best practices or whether it is the most efficient approach.
I would appreciate some advice on this function.
1 Answer
I honestly think you should be breaking up the treatment of the file into two parts, as you are already (implicitly) doing so in your function. One part to read the file into a line iterator, another to actually parse.
The "reading into an iterator" part is a simple combination of std::fs::File, std::io::BufReader, and BufRead::lines; I'll instead focus on the other part:
pub fn read_xy_pairs(i: impl Iterator<Item = String>) -> (Vec<f32>, Vec<f32>) {
    i
        .map(|string| -> Vec<f32> {
            string.split(" ")
                .take(2)
                .map(|element| element.parse::<f32>())
                .filter(|element| element.is_ok())
                .map(|element| element.unwrap())
                .collect()
        })
        .filter(|item| item.len() == 2)
        .map(|mut item| (item.swap_remove(0), item.swap_remove(0)))
        .unzip()
}
The processing is clearly laid out, in order:
- I convert each element of the iterator, in turn, into an iterator of &str via split()
- I then truncate this iterator lazily so that it takes only the first two elements
- Each element of this iterator gets parsed with parse(), and I filter out any element that isn't Ok, unwrapping those that are
- This iterator gets collected into a Vec<f32>
- From there, I filter out any item that does not have exactly 2 components, and convert those that do into a 2-tuple of (f32, f32)
- I then unzip this iterator of (f32, f32) into a (Vec<f32>, Vec<f32>) to fit your type signature
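The "reading into an iterator" half that was skipped above could look roughly like the sketch below. The helper names lines_of and lines_of_file are hypothetical (they are not part of the question or of the standard library), and stopping at the first I/O error is an assumption about what behavior is wanted. read_xy_pairs is repeated from the answer so the block compiles on its own.

```rust
use std::fs::File;
use std::io::{self, BufRead, BufReader};

/// Hypothetical helper: turn any buffered reader into an iterator of lines.
/// Assumption: iteration simply stops at the first I/O or UTF-8 error.
pub fn lines_of<R: BufRead>(reader: R) -> impl Iterator<Item = String> {
    reader.lines().map_while(Result::ok)
}

/// Hypothetical helper: open a file and hand back its lines.
/// A failure to open the file is returned to the caller instead of panicking.
pub fn lines_of_file(path: &str) -> io::Result<impl Iterator<Item = String>> {
    let file = File::open(path)?;
    Ok(lines_of(BufReader::new(file)))
}

/// The parsing half, as given in the answer.
pub fn read_xy_pairs(i: impl Iterator<Item = String>) -> (Vec<f32>, Vec<f32>) {
    i
        .map(|string| -> Vec<f32> {
            string.split(" ")
                .take(2)
                .map(|element| element.parse::<f32>())
                .filter(|element| element.is_ok())
                .map(|element| element.unwrap())
                .collect()
        })
        // Lines that did not yield exactly two numbers are dropped silently.
        .filter(|item| item.len() == 2)
        .map(|mut item| (item.swap_remove(0), item.swap_remove(0)))
        .unzip()
}
```

With these in place, something like lines_of_file("example.txt").map(|lines| read_xy_pairs(lines)) should give an io::Result holding the two vectors. Because lines_of accepts any BufRead, the parsing can also be exercised on an in-memory byte slice without touching the file system.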
- Thank you for your polite answer, but I have some questions. i) You use map twice (one with an arrow operator (?) and another without); what is the difference? ii) I don't fully understand the behavior of swap_remove. Could you explain it further or give me some references? – Y. P, Oct 23, 2019 at 14:28
- Hey @Y.P, and sorry about the delay. In order: i) it is not an arrow operator but a return type annotation. I am specifically telling the compiler that the map closure will output a Vec<_>, so that collect() can infer its return type without my having to store the result in an intermediate variable. ii) swap_remove() removes the element at position N and returns it, moving the last element into its place. As I am removing items, I need to make sure I remove the first one twice (the second element will have moved to the first position, index 0, after the first removal). Let me know if you have any other questions! – Sébastien Renauld, Oct 24, 2019 at 14:40
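The two points in the reply above can be illustrated with a small sketch. All function names here (with_annotation, with_turbofish, take_pair, swap_remove_demo) are hypothetical and exist only for illustration; the Vec::swap_remove and Iterator::collect calls are the standard library ones.

```rust
/// Variant A: the closure's return type is annotated with `-> Vec<f32>`,
/// which tells the inner collect() what to build.
pub fn with_annotation(lines: &[&str]) -> Vec<Vec<f32>> {
    lines
        .iter()
        .map(|line| -> Vec<f32> {
            line.split(' ')
                .filter_map(|e| e.parse::<f32>().ok())
                .collect()
        })
        .collect()
}

/// Variant B: same result, but the target type is given to collect()
/// directly with the "turbofish" syntax instead of on the closure.
pub fn with_turbofish(lines: &[&str]) -> Vec<Vec<f32>> {
    lines
        .iter()
        .map(|line| {
            line.split(' ')
                .filter_map(|e| e.parse::<f32>().ok())
                .collect::<Vec<f32>>()
        })
        .collect()
}

/// Taking both elements out of a two-element vector by value.
/// Panics if `item` has fewer than two elements.
pub fn take_pair(mut item: Vec<f32>) -> (f32, f32) {
    // item = [a, b]: returns a; the last element b is moved into index 0.
    let first = item.swap_remove(0);
    // item = [b]: returns b and leaves the vector empty.
    let second = item.swap_remove(0);
    (first, second)
}

/// swap_remove does not preserve order on longer vectors:
/// the removed slot is filled with the last element.
pub fn swap_remove_demo() -> (i32, Vec<i32>) {
    let mut v = vec![1, 2, 3];
    let removed = v.swap_remove(0);
    (removed, v)
}
```

For a two-element vector the order comes out as expected, which is why calling swap_remove(0) twice works in the answer. On longer vectors Vec::remove keeps the remaining order at the cost of shifting every later element.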
f33 type in line 3. Since fixing this typo doesn't invalidate any existing answer you may still do that.