With my rustymines 1 2 project, I realized, that it was cumbersome, to copy and paste all the code files into Markdown for review here. So I wrote a small program that formats source code files as Markdown for me:
src/main.rs
:
use std::collections::HashMap;
use std::env::args;
use std::fs::File;
use std::io::{BufReader, Error, Read};
use std::path::Path;
use lazy_static::lazy_static;
lazy_static! {
static ref LANGUAGES: HashMap<&'static str, &'static str> = HashMap::from([
("c", "c"),
("cpp", "cpp"),
("h", "cpp"),
("py", "python"),
("rs", "rust"),
("toml", "toml"),
]);
}
fn read_file(file: &Path) -> Result<String, Error> {
let fh = File::open(file)?;
let mut buf_reader = BufReader::new(fh);
let mut content = String::new();
match buf_reader.read_to_string(&mut content) {
Ok(_) => Ok(content),
Err(code) => Err(code),
}
}
fn format_file(file: &Path) {
match read_file(file) {
Ok(text) => {
let extension = match file.extension() {
Some(suffix) => suffix.to_str().unwrap_or(""),
None => "",
};
let language = LANGUAGES.get(extension).unwrap_or(&"");
match file.to_str() {
Some(filename) => {
println!("`{}`:", filename);
println!("```{}\n{}```", language, text);
}
None => eprintln!("Could not extract file name."),
}
}
Err(code) => eprintln!("Error reading file: {}", code),
}
}
fn main() {
let args: Vec<String> = args().collect();
args.iter()
.enumerate()
.filter(|&(index, _)| 0 < index)
.map(|(_, filename)| Path::new(filename))
.for_each(format_file)
}
Cargo.toml
:
[package]
name = "mdcode"
version = "0.1.0"
edition = "2021"
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[dependencies]
lazy_static = "*"
The markdown for file names and code blocks above has been generated by the script itself (which technically makes it a quine, no?) like so:
$ target/release/mdcode src/main.rs Cargo.toml
I'd like to have feedback on what I could improve here.
1 Answer 1
Bug
Take a look at what happens if I combine block doc comments and doc test. This perfectly reasonable code:
/**
Adds two numbers
```
# use mdcode::add::add;
# fn main(){
assert_eq!(add(2, 2), 4);
assert_eq!(add(-2, 2), 0);
# }
```
*/
pub fn add(a:i32, b:i32) -> i32{
a + b
}
renders to this:
src/add.rs
:
/**
Adds two numbers
use mdcode::add::add;
fn main(){
assert_eq!(add(2, 2), 4);
assert_eq!(add(-2, 2), 0);
}
*/
pub fn add(a:i32, b:i32) -> i32{
a + b
}```
Obviously not good. Either somehow escape the backticks or use indentation-based code blocks instead. Additionally, add a newline before the last set of backticks- I had to add an extra set of backticks to keep the code block from eating my post.
Take in an AsRef<Path>
instead of &Path
It's more generic, and since most people using file-based APIs just have a string representing the path anyway, it makes their life easier.
Remember Iterator::skip
exists
Fairly self-explanatory, and in conjunction with the previous tip, turns main
into
fn main() {
args().skip(1).for_each(format_file);
}
which is much cleaner.
read_file
already exists
There is a convenience function, std::fs::read_to_string
to read a file from a string already. Your implementation wasn't bad or anything, but there's no need to reinvent the wheel.
Take advantage of Option
's combinators
Option
and Result
have lots of convenient ways to combine and chain fallible functions. In particular, the match statement for getting the file extension should be
let extension = file
.extension()
.and_then(|ext| ext.to_str())
.unwrap_or_default();
Consider using the log
crate for logging
For a project of this size, the ad-hoc eprintln!
based logging implementation you have is probably fine, but if your project gets bigger that approach very quickly becomes unmaintainable. Using log
plus your favorite backend enables easily configurable logging suitable for whatever your project's needs may be.
Prefer once_cell
to lazy_static
There's nothing wrong with lazy_static
, but once_cell
has more features and also mirrors the API of the equivalent unstable types in the standard library. Additionally, avoid setting the version to "*"
as that is a very good way to cause weird version-related compilation errors, which really suck to deal with.
Set a few more fields in your Cargo.toml
Cargo.toml
files have lots of supported keys, and you should set a few more of them. At minimum, you should probably set the authors
and license
keys. A few other keys, like repository
and readme
, would be ideal as well.
Final thoughts
Your code is pretty good overall. It seems like you've learned the language pretty well, but still have some work to do learning the standard library and ecosystem. Improved, but not quite complete code:
src/main.rs
:
use std::collections::HashMap;
use std::env::args;
use std::path::Path;
use once_cell::sync::Lazy;
static LANGUAGES: Lazy<HashMap<&'static str, &'static str>> = Lazy::new(|| {
HashMap::from([
("c", "c"),
("cpp", "cpp"),
("h", "cpp"),
("py", "python"),
("rs", "rust"),
("toml", "toml"),
])
});
fn format_file<A: AsRef<Path>>(file: A) {
let file = file.as_ref();
match std::fs::read_to_string(file) {
Ok(text) => {
let extension = file
.extension()
.and_then(|ext| ext.to_str())
.unwrap_or_default();
let language = LANGUAGES.get(extension).unwrap_or(&"");
//FIXME: escape the text properly, or use indentation
match file.to_str() {
Some(filename) => {
println!("`{}`:", filename);
println!(
"```{}\n{}{}```",
language,
text,
if !text.ends_with('\n') { "\n" } else { "" }
);
}
None => eprintln!("Could not extract file name."),
}
}
Err(code) => eprintln!("Error reading file: {}", code),
}
}
fn main() {
args().skip(1).for_each(format_file);
}
Cargo.toml
:
[package]
name = "mdcode"
authors = ["your name here"]
version = "0.1.0"
edition = "2021"
license = "MIT"
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[dependencies]
once_cell = "1.15"