This is an implementation of BIP0039 in Rust (I used python-mnemonic as a guide). I'd like my first Rust program to be critiqued.
I would like to know if the program is structured correctly, if I've chosen the right constructs to use where, any other general comments on what to improve to make it cleaner and more Rust-y
main.rs:
extern crate getopts;
extern crate core;
extern crate mnemonic;
extern crate "rustc-serialize" as serialize;
use mnemonic::Mnemonic;
use serialize::hex::{FromHex, ToHex};
use getopts::{reqopt,optflag,getopts,OptGroup};
use std::os;
use std::iter::repeat;
use std::rand::{OsRng, Rng};
use std::old_io::File;
//getopts help message
fn print_usage(program: &str, _opts: &[OptGroup]) {
println!("Usage: {} [options]", program);
println!("-s\t\tSeed");
println!("-h --help\tUsage");
}
fn main() {
/* start handling opts */
let args: Vec<String> = os::args();
let program = args[0].clone();
let opts = &[
reqopt("s", "seed", "set mnemonic seed", ""),
optflag("h", "help", "print this help menu")
];
let matches = match getopts(args.tail(), opts) {
Ok(m) => { m }
Err(f) => { panic!(f.to_string()) }
};
if matches.opt_present("h") {
print_usage(program.as_slice(), opts);
return;
}
let seed = match matches.opt_str("s") {
Some(x) => x,
None => panic!("No seed given"),
};
/* end opts, seed value below */
let str_seed:&str = seed.as_slice();
let mut rng = match OsRng::new() {
Ok(g) => g,
Err(e) => panic!("Failed to obtain OS RNG: {}", e)
};
let path = Path::new("src/wordslist/english.txt");
let display = path.display();
let mut file = match File::open(&path) {
Err(why) => panic!("couldn't open {}: {}", display, why.desc),
Ok(file) => file,
};
let words:String = match file.read_to_string() {
Err(why) => panic!("couldn't read {}: {}", display, why.desc),
Ok(string) => string,
};
//generate corner cases
for &i in [16us,24,32].iter() {
for &n in ["00","7f","80","ff"].iter() {
let corner_chars = repeat(n).take(i).collect();
process(corner_chars,str_seed,words.as_slice());
}
}
//generate random seeds
for gen_seed in range(0us,12) {
let length = 8 * (gen_seed % 3 + 2);
let random_chars:String = rng.gen_ascii_chars().take(length).collect();
process(random_chars,str_seed,words.as_slice());
}
}
fn process(random_chars:String,str_seed:&str,words:&str) {
println!("random characters: {}",random_chars);
let mnemonic:Mnemonic = Mnemonic::new(random_chars);
let mut mnem_words = Vec::new();
for i in range(0us,mnemonic.binary_hash.len() / 11) {
let bin_idx = mnemonic.binary_hash.slice(i*11,(i+1)*11);
let idx = std::num::from_str_radix::<isize>(bin_idx, 2).unwrap();
mnem_words.push(words.as_slice().words().nth(idx as usize).unwrap()); //check for better way of doing this
}
let str_mnemonic = format!("{:?}",mnem_words);
println!("mnemonic: {}", str_mnemonic);
let key_value = mnemonic.to_seed(str_mnemonic.as_slice(),str_seed); //to_string() on a Vec<&str>?
println!("key: {}",key_value.as_slice().to_hex());
}
lib.rs:
extern crate crypto;
extern crate "rustc-serialize" as rustc_serialize;
use crypto::pbkdf2::{pbkdf2};
use crypto::sha2::{Sha256, Sha512};
use crypto::hmac::Hmac;
use crypto::digest::Digest;
use std::old_io::File;
use rustc_serialize::hex::{FromHex, ToHex};
use std::iter::repeat;
static EMPTY:&'static str = "00000000"; //'
static PBKDF2_ROUNDS:u32 = 2048;
static PBKDF2_KEY_LEN:usize = 64;
pub struct Mnemonic {
pub binary_hash:String,
}
impl Mnemonic {
pub fn new(chars:String) -> Mnemonic {
let h:String = Mnemonic::gen_sha256(chars.as_slice());
//get binary string of random seed
let s_two:String = Mnemonic::to_binary(chars.as_bytes());
//get binary str of sha256 hash
let h_two:String = Mnemonic::to_binary(h.from_hex().unwrap().as_slice());
let length = s_two.len() / 32;
//concatenate the two binary strings together
let random_hash:String = s_two + h_two.slice_to( length ).as_slice();
let mn = Mnemonic {
binary_hash: random_hash,
};
mn
}
pub fn to_seed(&self,mnemonic:&str, seed_value:&str) -> Vec<u8> {
let mut mac = Hmac::new(Sha512::new(),mnemonic.as_bytes());
let mut result:Vec<u8> = repeat(0).take(PBKDF2_KEY_LEN).collect();
let mut salt:String = String::from_str("mnemonic");
salt.push_str(seed_value);
pbkdf2(&mut mac, salt.as_bytes(), PBKDF2_ROUNDS, result.as_mut_slice());
result
}
fn gen_sha256(hashme:&str) -> String {
let mut sh = Sha256::new();
sh.input_str(hashme);
sh.result_str()
}
fn to_binary(input:&[u8]) -> String {
let mut s_two = String::new();
for &s_byte in input.iter() {
let byte_slice = format!("{:b}",s_byte);
let mut empty = String::from_str(EMPTY);
empty.push_str(byte_slice.as_slice());
let slice = empty.slice_from(empty.len()-8);
s_two.push_str(slice);
}
s_two
}
}
I'd love to see some comments on my actual use of struct and whether you think I've made a Mnemonic struct type with attributes that make sense.
For example, I would love to have the mnemonic attribute be the actual word mnemonic (Vec<&str> I imagine) but I'm not sure that makes sense given that a mnemonic may be in multiple languages, and that the binary string is the reference for creating the actual word list.
Additionally, In the process function (main.rs) I have a for.. in which splices the binary string into words by looking them up in the wordlist. It seems logical to me that this would be in the mnemonic function, but I ran into an issue of how to pass a wordlist to the mnemonic without copying the whole darn list, passing a reference gave me lifetime errors.
I'd also like to personally thank you Shepmaster for helping me out while I've been learning Rust, I have asked no less than 5 questions based on this very application and I'm certain you've answered all of them.
1 Answer 1
(I'm still reading and understanding, so I'll keep updating as I grok more and more.)
General style
- Use spaces after colons denoting types.
foo: u8 = 1
instead offoo:u8 = 1
. - Make the spacing between methods consistent and meaningful when it changes. Some of your functions and
impl
s butt right against the previous block. - No spaces inside of parens.
(length)
instead of( length )
. - Spaces after commas.
&self, mnemonic
instead of&self,mnemonic
. - Use type inference when you can.
let mnemonic = Mnemonic::new()
instead oflet mnemonic: Mnemonic = Mnemonic::new()
. - Leave off match arm braces where you can.
Ok(m) => m,
instead ofOk(m) => { m }
. - Don't create a variable just to return it.
54
instead oflet a = 54; a
. - Whenever you are just reading a string, take a
&str
instead ofString
. - Whenever you are reading a slice, take a
&[T]
instead ofVec<T>
.
20000-ft view
I think you are right - your struct doesn't make sense as it is now. None of the struct's methods use self
(and there's only one that even takes it as a parameter). Then in process
, you directly poke into the guts of Mnemonic to do interesting things. I think that you've really got two things - a Mnemonic
(doesn't exist yet) and a MnemonicBuilder
(the current Mnemonic
).
You also should have some dedicated tests. You have the start of these where you "generate corner cases". However, automated tests will save your bacon! They also would have helped me ensure that I didn't break your code as I was "fixing" it. :-)
main
You should split by words just once, then reuse that:
let word_backing: String = match file.read_to_string() {
Err(why) => panic!("couldn't read {}: {}", display, why.desc),
Ok(string) => string,
};
let words: Vec<_> = word_backing.words().collect();
This is the main interesting point. We take &str
s and slices of &str
s. The middle chunk of this, the loop, probably belongs on Mnemonic
. Also, it looks like you are now undoing the binary strings that we created before. If so, then perhaps you should just store a Vec<u8>
and avoid converting to and from binary. You'd save space and time, the best kind of win!
fn process(random_chars: &str, str_seed: &str, words: &[&str]) {
println!("random characters: {}", random_chars);
let mnemonic = Mnemonic::new(random_chars);
let mut mnem_words = Vec::new();
for i in range(0us, mnemonic.binary_hash.len() / 11) {
let bin_idx = &mnemonic.binary_hash[i*11..(i+1)*11];
let idx: isize = std::num::from_str_radix(bin_idx, 2).unwrap();
mnem_words.push(words[idx as usize]);
}
let str_mnemonic = format!("{:?}", mnem_words);
println!("mnemonic: {}", str_mnemonic);
let key_value = mnemonic.to_seed(&str_mnemonic, str_seed);
println!("key: {}", key_value.to_hex());
}
Mnemonic
new
The big breakthrough here was realizing we were making hash
hex for no particular reason. I also beefed up the variable names, giving them a bit more understanding.
pub fn new(seed: &str) -> Mnemonic {
let hash = Mnemonic::gen_sha256(&seed);
let hash_binary = Mnemonic::to_binary(&hash);
let mut seed_binary = Mnemonic::to_binary(seed.as_bytes());
let length = seed_binary.len() / 32;
seed_binary.push_str(&hash_binary[..length]);
Mnemonic {
binary_hash: seed_binary,
}
}
to_seed
The main differences here are
- Using
vec!
to construct a filled vector, this is just a bit simpler to read than the more powerfulrepeat
/take
/collect
. - Removing one
mut
by constructing the string all in one go, usingformat!
- Changing the
static
global values toconst
. The difference is thatstatic
items are accessible at run time by reference.const
is closer in concept to a value just being inlined at the use-site.
Afterwards:
pub fn to_seed(&self, mnemonic: &str, seed_value: &str) -> Vec<u8> {
let mut mac = Hmac::new(Sha512::new(), mnemonic.as_bytes());
let salt = format!("mnemonic{}", seed_value);
let mut result = vec![0u8; PBKDF2_KEY_LEN];
pbkdf2(&mut mac, salt.as_bytes(), PBKDF2_ROUNDS, &mut result);
result
}
gen_sha256
After looking at the constructor, I realized that this should just return a vector of bytes, not a string. We just de-hex the string anyway:
fn gen_sha256(hashme: &str) -> Vec<u8> {
let mut sh = Sha256::new();
sh.input_str(hashme);
let mut result: Vec<u8> = iter::repeat(0).take(sh.output_bytes()).collect();
sh.result(&mut result);
result
}
to_binary
I feel that this is a poorly-named method. I would expect something called
binary
to return a slice of bytes, not a string. Maybeto_binary_string
would be enough.Instead of creating a string with the character
0
, you can use formatting modifiers to have a zero-padded string. Specifically, you want{:08b}
.
Here's a slimmer version of this method, but one that allocates more than needed:
fn to_binary(input:&[u8]) -> String {
let bit_strings: Vec<_> = input.iter().map(|byte| format!("{:08b}", byte)).collect();
bit_strings.concat()
}
And another version that's a bit more cautious about allocations:
use std::fmt::Writer;
fn to_binary(input: &[u8]) -> String {
let mut s = String::new();
for byte in input.iter() {
write!(&mut s, "{:08b}", byte).unwrap();
}
s
}
-
\$\begingroup\$ I've added some additional statements to the original post, I'd be much appreciated if you could read it over and give me your comments. \$\endgroup\$leshow– leshow2015年02月04日 20:28:57 +00:00Commented Feb 4, 2015 at 20:28
-
\$\begingroup\$ @leshow I think I'm done with my first pass ^_^ \$\endgroup\$Shepmaster– Shepmaster2015年02月05日 01:19:43 +00:00Commented Feb 5, 2015 at 1:19
-
\$\begingroup\$ you're great Shep, thanks for the in depth breakdown. \$\endgroup\$leshow– leshow2015年02月05日 17:52:31 +00:00Commented Feb 5, 2015 at 17:52
-
\$\begingroup\$ it will take me a bit to digest all of this. if you want to know why i had things in hex, it's because that was the way it was done in the python implementation (github.com/trezor/python-mnemonic/blob/master/mnemonic/…). \$\endgroup\$leshow– leshow2015年02月05日 18:01:34 +00:00Commented Feb 5, 2015 at 18:01
//'
at the end of lines that introduce an odd number of lifetimes. \$\endgroup\$