6
\$\begingroup\$

This code is meant to take in a command line argument and output a 'tag cloud'. It's more of an exercise in learning F# for me because this is my first non-tutorial code file. How could this code be improved?

open System.IO
open System
[<EntryPoint>]
let main args = 
 let directory = args.[0]
 let fileNames = Directory.EnumerateFiles(directory, "*Tests.cs", SearchOption.AllDirectories) |> Seq.toList
 let allLines = List.collect(fun (x:string) -> System.IO.File.ReadLines(x) |> Seq.toList) fileNames
 let allWords = List.collect(fun (x:string) -> x.Split([|' ';'_';';';':';'(';')';'\\';'/';'>';'<';'{';'}';'0';'1';'2';'3';'4';'5';'6';'7';'8';'9';'.'|], StringSplitOptions.RemoveEmptyEntries) |> Seq.toList) allLines
 //printfn "%A" allWords
 let countOccurance (word:string) list = 
 let count = List.filter (fun x -> word.Equals(x)) list
 (word, count.Length)
 let distinctWords = allWords |> Seq.distinct |> Seq.toList
 let print (tup:string*int) =
 match tup with
 | (a,b) -> printfn "%A: %A" a b
 let rec wordCloud distinct (all:string list) (acc:(string*int) list) =
 match distinct with
 | [] -> acc
 | head :: tail -> 
 let accumSoFar = acc @ [(countOccurance head all)]
 wordCloud tail all accumSoFar
 let acc = []
 let cloud = (wordCloud distinctWords allWords acc)
 let rec printTup (tupList:(string*int) list) =
 match tupList with
 | [] -> 0
 | head :: tail -> 
 printfn "%A" head
 printTup tail
 printTup cloud
 0
Jamal
35.2k13 gold badges134 silver badges238 bronze badges
asked Apr 21, 2015 at 12:22
\$\endgroup\$

1 Answer 1

4
\$\begingroup\$

inb4 Scott :-)

Use functions more. Composability is not very important in this example, but it's good to get into that habit early on.

In that spirit, turn the non-trivial value bindings into functions instead of closing over a previously defined value. Also try turning the non-trivial lambda expressions into functions; that gives them a name and makes the call site more readable.

Also, pipe more. :-) Especially for the list/sequence functions that take a higher order function, it's more readable to pipe the list into them, because it makes it more obvious what you're starting from.

Also, in my experience, defining "constant" values and functions first before actually doing any work makes the code much easier to follow, because you don't have to think about what values you already "have" at which point.

Oh, and while tail recursion is a very cool thing to have, it's not very useful here and actually makes the code quite a bit less understandable.

Did you put the type annotations in on purpose? In most cases, you don't need them because of F#'s powerful type inference. They can sometimes make code more easy to follow, but they might also constrain it where it could actually be more generic, and in general not having to specify types all the time is one of the really nice things about F#. :-)

Edit: Reading this again made me think of another important point: The individual function declarations don't belong in the main function; moving them out of there makes the distinction between "parts" and the actual "running" program even clearer.

With these things taken into consideration, your code could look like this:

open System
open System.IO
// It's a tiny bit more maintainable to keep this list separate, and makes it more readable where it is used
let separators = [|' ';'_';';';':';'(';')';'\\';'/';'>';'<';'{';'}';'0';'1';'2';'3';'4';'5';'6';'7';'8';'9';'.'|]
let fileNamePattern = "*Tests.cs"
// Function with the directory and pattern as a parameter
let getFileNames fileNamePattern directory =
 Directory.EnumerateFiles(directory, fileNamePattern, SearchOption.AllDirectories)
 |> Seq.toList
// Function with the list of filenames as a parameter
let getAllLines fileNames = fileNames |> List.collect (System.IO.File.ReadLines >> Seq.toList)
// Function with the list of lines as a parameter
let getAllWords lines =
 lines
 |> List.collect (fun (line : string) ->
 line.Split(separators, StringSplitOptions.RemoveEmptyEntries)
 |> Seq.toList)
// Reordered parameters for pipeability
let countOccurrences allWords word =
 // The = operator can be used as a function that takes two arguments.
 let count = allWords |> List.filter ((=) word) |> List.length
 word, count
// Function that takes a sequence of "anything" and returns a list of the distinct values
let distinctWords = Seq.distinct >> Seq.toList
// F# string formatting is strongly typed and compiler checked; take advantage of that, don't just use %A
let printWordCount (word, count) = printfn "%s: %i" word count
// Now that everything is "set up", we can "run" with the current data:
[<EntryPoint>]
let main args =
 // This binding isn't really necessary, but it's nice to give the value a name
 let directory = args.[0]
 // We now have a clear execution path that is very easy to follow
 let allWords =
 directory
 |> getFileNames fileNamePattern
 |> getAllLines
 |> getAllWords
 let distinct = distinctWords allWords
 distinct
 |> List.map (countOccurrences allWords)
 |> List.iter printWordCount
 0
answered Apr 21, 2015 at 14:58
\$\endgroup\$
3
  • \$\begingroup\$ A lot of my difficulties with F# so far have been with type inference and things becoming arrays when I think they are lists and so on, which is why I was annotating params so often. This code is much clearer, wow! I need to really sit down and get a feel for it. \$\endgroup\$ Commented Apr 21, 2015 at 16:01
  • \$\begingroup\$ I see; using type annotations is a good strategy then for verifying your understanding and expectations, yes. \$\endgroup\$ Commented Apr 21, 2015 at 16:05
  • \$\begingroup\$ I realized the function declarations don't actually belong inside the main method (they could really live "anywhere"); I have updated the code accordingly. \$\endgroup\$ Commented Apr 22, 2015 at 0:29

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.