I'm new in OCaml and just want to be sure that I write code in "ocaml way". My other first programs in OCaml was imperative and looks like python-code. So I decided to write simple code and share it for review.
let print lst =
List.map string_of_int lst
|> String.concat " "
|> print_endline
;;
let rec sort lst =
let sorted = match lst with
| hd1 :: hd2 :: tl ->
if hd1 > hd2 then
hd2 :: sort (hd1 :: tl)
else
hd1 :: sort (hd2 :: tl)
| tl -> tl
in
if lst = sorted then
lst
else
sort sorted
;;
let () =
let lst1 = [3; 4; 2; 1] in
print lst1;
sort lst1 |> print;
let lst2 = [5; 6; 7; 0; 1; 4; 2; 9; 3; 8] in
print lst2;
sort lst2 |> print;
;;
Result:
3 4 2 1
1 2 3 4
5 6 7 0 1 4 2 9 3 8
0 1 2 3 4 5 6 7 8 9
2 Answers 2
Your code is easy to understand, so I can only suggest minor changes. The first thing is just a preference of mine: if a function body is nothing but a match ... with
expression, I use the function
keyword, instead. The second, more important recommendation is to use three clauses in the match body so you don't build a new, temporary list (hd2 :: tl
) as you recurse. With both of these minor changes, sort
looks like this:
let rec sort lst =
let sorted = function
| hd1 :: hd2 :: tl when hd1 > hd2 ->
hd2 :: sort (hd1 :: tl)
| hd1 :: tl ->
hd1 :: sort tl
| tl -> tl
in
if lst = sorted then
lst
else
sort sorted
As the list gets more and more sorted, the second pattern gets matched more often and simply travels down the list. Your version continues to build a new list -- even when the list is sorted.
One other thing: OCaml has an equality operator that tests for physical equality (in addition to structural equality.) So we can avoid building a new list in the second case, if it's not necessary:
| (hd1 :: tl) as lst ->
let tl' = sort tl in
if tl' == tl then
lst
else
hd1 :: tl'
Both these changes will prevent a lot of temporary objects from being created. Unfortunately, it's still a bubble sort, so it's never going to be a speed-demon. :D
I feel like we can clean up your function a little bit by:
- Using conditional guards.
- Matching the simplest base case (an empty list) first.
- And by not bothering to match
hd1::hd2::tl
and then immediately rebuildhd2::tl
.
let rec sort lst =
let sorted = match lst with
| [] -> []
| hd1::hd2::tl when hd1 > hd2 -> hd2 :: sort (hd1 :: tl)
| hd1::tl -> hd1 :: sort tl
in
if lst = sorted then lst
else sort sorted
Now your conditional to check that lst = sorted
is doing an O(n) check for structural equality. This is basically checking to see if no swaps happened, but we can just keep track of this as we iterate through in the first place.
let rec sort lst =
let rec pass ?(swap=false) = function
| [] -> (swap, [])
| hd1::hd2::tl when hd1 > hd2 ->
let (swap, tl) = pass ~swap: true (hd1 :: tl) in
(swap, hd2 :: tl)
| hd1::tl ->
let (swap, tl) = pass ~swap tl in
(swap, hd1 :: tl)
in
match pass lst with
| false, lst -> lst
| _, lst -> sort lst
Now of course, we'd also want to realize that in an imperative bubble sort, we know that on each pass we've sorted the largest element to the end of the list, so we need to sort one less element of the list. Now, since the last element of an OCaml list is expensive to access, this optimization doesn't make much sense.
You do probably want to abstract out a comparison function rather than simply relying on the polymorphic >
operator.
let rec sort cmp lst =
let rec pass ?(swap=false) = function
| [] -> (swap, [])
| hd1::hd2::tl when cmp hd1 hd2 > 0 ->
let (swap, tl) = pass ~swap: true (hd1 :: tl) in
(swap, hd2 :: tl)
| hd1::tl ->
let (swap, tl) = pass ~swap tl in
(swap, hd1 :: tl)
in
match pass lst with
| false, lst -> lst
| _, lst -> sort cmp lst
Printing
An improvement to the printing component of your program can be found in the Format
module in the standard library. As of OCaml 4.02 (released in 2015) it contains pp_print_list
.
Using it, we can move from:
let print lst = List.map string_of_int lst |> String.concat " " |> print_endline
Which we might incidentally use |>
again to change to:
let print lst =
lst
|> List.map string_of_int
|> String.concat " "
|> print_endline
To:
let pp_print_int fmt i =
Format.fprintf fmt "%d" i
let pp_print_space fmt () =
Format.fprintf fmt " "
let print lst =
Format.printf
"%a\n"
(Format.pp_print_list ~pp_sep: pp_print_space pp_print_int)
lst
This approach has the benefit of avoiding the creation of several strings, and the work of concatenating them together.