I have been working through the Cryptopals challenges in Ocaml. My interest in Ocaml is to better learn functional programming techniques and effective usage of the type system.
Set 1 Challenge 4 says 'One of the 60-character strings in this file has been encrypted by single-character XOR. Find it.'
My code finds the correct string. I tried using user-defined types (e.g. xor_input
, xor_output
) to avoid passing around lots of string
types that could easily be muddled. However, I am not sure of what is an appropriate level of type annotation. For example:
- is there any merit in
make_ranking
specifying a return type ofscore
if it ends in|> Score
? My understanding of Haskell is that I could use the function's type annotation to specify that it returns aScore
and theint
returned from thefold
would automatically be wrapped in ascore
type - is there something similar in Ocaml?|> Score
seems a cumbersome way to specify the return type. - I was also unsure how to convert to and from a user-defined type and its unwrapped value (e.g.
score
andint
). My approach insort_results
andoutput
(destructuring) seems inelegant to me. - in
sort_results
I destructure the arguments to get the scores. Does this not mean that any type with ascore
field will be accepted?
(Note that Hexx
is a module I wrote to convert a base64 encoded string to a hex string. Assume the input is a file containing a list of strings.)
open Core
let english = "etaoin srhldcumfpgwybvkxjqzETAOINSRHLDCUMFPGWYBVKXJQZ0123456789.?!"
type score = Score of int
type xor_input = Xor_input of string
type xor_output = Xor_output of string
type result =
{
test_char : char;
score : score;
xor_input : xor_input;
xor_output : xor_output;
}
let xor c (Xor_input input) : xor_output =
String.map input ~f:(fun l -> int_of_char l |> (lxor) c |> char_of_int) |> Xor_output
let make_ranking (Xor_output str) : score =
String.fold str ~init:0 ~f:(fun total c ->
total + match String.index english c with
| Some v -> v
| None -> 100 (* arbitary value for non-letters *)
) |> Score
let make_result (input : xor_input) c : result =
let output = xor c input in
{ test_char = char_of_int c; score = make_ranking output; xor_input = input; xor_output = output }
let sort_results { score = first; _ } { score = second; _ } =
let (Score first_val) = first in
let (Score second_val) = second in
first_val - second_val
let find_best_match results : result =
let best = List.sort ~cmp:sort_results results |> List.hd in
match best with
| Some x -> x
| None -> failwith "No best result found"
let xor_line line =
let decoded = Hexx.decode line in
List.range 0 255
|> List.map ~f:(make_result (Xor_input decoded))
|> find_best_match
let output {test_char; score; xor_input; xor_output} =
let (Score s) = score in
let (Xor_input input) = xor_input in
let (Xor_output out) = xor_output in
fprintf stdout "Char: %c, Score: %d, Original: %s, Output: %s\n" test_char s input out
let () =
In_channel.read_lines "4.txt"
|> List.map ~f:xor_line
|> find_best_match
|> output
1 Answer 1
Congrats on tackling cryptopals! These are great exercises and very interesting to learn new languages and techniques.
Syntax
Remove the type annotations
In most cases in OCaml, you don't need any type annotations. So instead of:
let make_result (input : xor_input) c : result =
It's more usual to write:
let make_result input c =
The reason is that for exposed functions, the type will be in the .mli
file, and for local functions, it's only one key binding away in your editor thanks to merlin.
Pipelines
The pipe operator is powerful and works well in some cases like your main function, but combined with partial application and infix operators, it can become hard to read.
I'd rewrite:
fun l -> int_of_char l |> (lxor) c |> char_of_int
As:
fun l -> char_of_int (c lxor (int_of_char l))
Pattern tricks
In places like this:
let (Xor_input input) =
You can omit the parens and just write:
let Xor_input input =
You can also perform several layers of pattern matching in one step. So rather than writing
let sort_results { score = first; _ } { score = second; _ } =
let (Score first_val) = first in
let (Score second_val) = second in
You can write:
let sort_results { score = Score first_val; _ } { score = Score second_val; _ } =
Record shorthand syntax
When building a record, parts that look like label = label
can be written just label
. So you can rename variables and change:
let make_result input c =
let output = xor c input in
{ test_char = char_of_int c; score = make_ranking output; xor_input = input; xor_output = output }
To:
let make_result xor_input c =
let xor_output = xor c xor_input in
{ test_char = char_of_int c; score = make_ranking output; xor_input; xor_output }
Type design
I don't think that the xor_input
and xor_output
bring a lot here. You're probably be as OK with plain strings.
Other remarks
You are sorting a whole list just to get the minimal element. There is probably a function to do that in core
(and it will be linear).
Instead of fprintf stdout
, you can just use printf
.
That's it! Have a nice journey learning OCaml and please post other examples on this site :)