After reading a lot about F#, I am today cutting my very first real life F# code and I'd appreciate any feedback on my approach.
I'm writing a simple utility to analyse a log file. The code below represents my approach for identifying sections of the log and representing them in a structure. The "real life" situation is to find application sessions in the log file. The example below model this by parsing a list of numeric strings!
The main areas of relevance are the GetSection
and GetSections
methods below. My questions are:
- Is my approach sensible?
- Is this functional and idiomatic?
- Is there an alternative approach that would be clearer or more elegant?
There is a LinqPad version of the code available here: http://share.linqpad.net/99khmx.linq
// Here are the "lines" from my "log file".
let lines = [1 .. 25] |> List.map (fun x -> x.ToString())
// For this exercise, section criteria is divisibility by 5.
// val DivisibleByFive : s:string -> bool
let DivisibleByFive (s:string) =
System.Int32.Parse(s) % 5 = 0
// I am going to use the following type to hold the information
// from each section of the log.
type Section (lines:string list) =
member x.Lines = lines
// Read a single section from the top of the log (based on: http://fssnip.net/iV).
// val GetSection : lines:string list -> Section * string list
let GetSection (lines:string list) =
let rec getSection (l:string list) result =
match l with
| h::t when DivisibleByFive h ->
let values = h::result |> List.rev
(new Section(values), t)
| h::t when not (DivisibleByFive h) ->
getSection t (h::result)
| _ ->
(new Section(result), [])
getSection lines []
// Get a list of all sections from the log.
// val GetSections : lines:string list -> Section list
let GetSections (lines:string list) =
let rec getSections (l:string list) result =
match GetSection l with
| (a, []) -> (a::result)
| (a, b) -> getSections b (a::result)
getSections lines [] |> List.rev
[<EntryPoint>]
let main argv =
let sections = GetSections lines
sprintf "Parsed %d section from the log:" sections.Length |> System.Console.WriteLine
for i in sections do
sprintf " Section starting with %s" i.Lines.[0] |> System.Console.WriteLine
let s = System.Console.ReadLine()
0
Edit: The refactored code following the comments by @mjolka is below (including the bug fix mentioned in @mjolka's update). The refactored code is available as a LinqPad or a Gist.
// Here are the "lines" from my "log file".
let lines = [1 .. 25] |> List.map (fun x -> x.ToString())
// For this exercise, section criteria is divisibility by 5.
// val DivisibleByFive : s:string -> bool
let DivisibleByFive (s:string) =
System.Int32.Parse(s) % 5 = 0
// I am going to use the following type to hold the information
// from each section of the log.
type Section (lines:string list) =
member x.Lines = lines
/// <summary>
/// Splits the given sequence into non-empty contiguous subsequences, such
/// that the last element of every subsequence (except possibly the last)
/// satisfies the given predicate. No other elements satisfy the predicate.
/// </summary>
/// <example>
/// <c>
/// let even x = x % 2 = 0
/// printfn "%A" (splitAtEach even (seq { 1 .. 3 }))
/// printfn "%A" (splitAtEach even (seq { 1 .. 4 }))
/// </c>
/// The output is:
/// <c>
/// seq [[1; 2]; [3]]
/// seq [[1; 2]; [3; 4]]
/// </c>
/// </example>
let splitAtEach (predicate : 'T -> bool) (xs : seq<'T>) : seq<'T list> =
let section = new ResizeArray<'T>()
seq {
for x in xs do
section.Add x
if predicate x then
yield Seq.toList section
section.Clear()
if section.Any() then
yield Seq.toList section
}
[<EntryPoint>]
let main argv =
let sections =
splitAtEach DivisibleByFive lines
|> Seq.map (fun lines -> new Section(lines))
|> Seq.toList
printfn "Parsed %d section from the log:" sections.Length
sections |> List.iter (fun section -> printfn " Section starting with %s"
let _ = System.Console.ReadLine() // or System.Console.ReadLine() |> ignore
0
1 Answer 1
Welcome to the world of F#! Your code is looking good, and thanks for posting the LinqPad file. Here are a couple of pointers:
sprintf ... |> System.Console.WriteLine
can be replaced by a call toprintfn
.- The variable name
i
in the for loop is misleading, as it looks like an index. Calling itsection
would be better. - If you don't need the result of an expression, such as
System.Console.ReadLine
, it's common to name the variable_
.
Then your main
function would look like this:
let main argv =
let sections = GetSections lines
printfn "Parsed %d section from the log:" sections.Length
for section in sections do
printfn " Section starting with %s" section.Lines.Head
let _ = System.Console.ReadLine()
0
You could also write the for
loop using List.iter
, but it's a matter of personal preference:
sections |> List.iter (fun section -> printfn " Section starting with %s" section.Lines.Head)
Now to the main part of your code. I would say that it's functional and idiomatic, but in my experience with F#, sometimes it's best to compromise on the functional approach.
For example, consider GetSections
. The basic idea is to go through a sequence, building up lists until a certain predicate is met. I feel that the functional approach, with its list reversal, obscures the intention of the code. Here is what I would propose:
Edit Fixed a bug where last elements might not be returned.
/// <summary>
/// Splits the given sequence into non-empty contiguous subsequences, such
/// that the last element of every subsequence (except possibly the last)
/// satisfies the given predicate. No other elements satisfy the predicate.
/// </summary>
/// <example>
/// <c>
/// let even x = x % 2 = 0
/// printfn "%A" (splitAtEach even (seq { 1 .. 3 }))
/// printfn "%A" (splitAtEach even (seq { 1 .. 4 }))
/// </c>
/// The output is:
/// <c>
/// seq [[1; 2]; [3]]
/// seq [[1; 2]; [3; 4]]
/// </c>
/// </example>
let splitAtEach (predicate : 'T -> bool) (xs : seq<'T>) : seq<'T list> =
let section = new ResizeArray<'T>()
seq {
for x in xs do
section.Add x
if predicate x then
yield Seq.toList section
section.Clear()
if section.Any() then
yield Seq.toList section
}
Then GetSections
could be written
let GetSections lines =
splitAtEach DivisibleByFive lines
|> Seq.map (fun lines -> new Section(lines))
|> Seq.toList
Edit Found a bug in your definition of GetSections
, where GetSections ["1"; "2"; "3"] = [["3"; "2"; "1"]]
.
-
2\$\begingroup\$ another common thing is doing something like
System.Console.ReadLine() |> ignore
without ever usinglet
(yeah it reads a bit strange) \$\endgroup\$Random Dev– Random Dev2014年06月29日 13:44:26 +00:00Commented Jun 29, 2014 at 13:44 -
\$\begingroup\$ @mjolka - thank you, that's really useful and has helped me a lot. I agree that
GetSections
andGetSection
were not expressing their intent as clearly as they should have been. That's a valuable lesson! :) Using Seq is much clearer and I can get a whole lot of reuse fromsplitAtEach
in the "real" code. Very much appreciated! An updated LinqPad version is here: share.linqpad.net/as8ivp.linq. \$\endgroup\$Sean Kearon– Sean Kearon2014年06月29日 15:59:45 +00:00Commented Jun 29, 2014 at 15:59 -
\$\begingroup\$ I've also now updated the question to show the refactored code. \$\endgroup\$Sean Kearon– Sean Kearon2014年06月29日 16:12:43 +00:00Commented Jun 29, 2014 at 16:12
-
\$\begingroup\$ @SeanKearon I've fixed a bug in my code, please see the edit. \$\endgroup\$mjolka– mjolka2014年06月29日 23:10:20 +00:00Commented Jun 29, 2014 at 23:10
-
\$\begingroup\$ Thank you - I've updated the finished code above and in the LinqPad sample. \$\endgroup\$Sean Kearon– Sean Kearon2014年06月30日 06:09:42 +00:00Commented Jun 30, 2014 at 6:09