I'm rather new to F# and functional programming. I wrote the following method to concatenate some custom dsl scripts. The function works are expected without errors. I wanted to know if there was a better way to rewrite this function.
let foldScript (storeAppScripts : seq<string * string * string>) =
storeAppScripts
|> Seq.choose( fun (store, app, script) ->
match script with
| "" -> None
| _ -> Some (store, app, script) )
|> Seq.groupBy(fun (store, app, script) -> store)
|> Seq.map(fun (store, group) -> store, (group
|> Seq.groupBy(fun (_, app, _)-> app)
|> Seq.map(fun (app, group) -> app , (group |> Seq.fold(fun accum (_,_, script) -> sprintf "%s\r\n%s" accum script) ""))
|> Seq.map(fun (app, script) -> sprintf "\r\n\tuse Application = %s %s" app (script.ToString()))))
|> Seq.map(fun (store, script) ->
sprintf "use Store=%s%s" store (script |> Seq.fold( fun accum appScript -> sprintf "%s%s" accum appScript) ""))
|> Seq.fold(fun acc script -> sprintf "%s\r\n%s" acc script) ""
let s = [("s1", "a1", "script1"); ("s1", "a2", "script2"); ("s2", "a3", "script4")]
let r = foldScript s
//Expected result
// use Store=s1
// use Application = a1
// script1
// use Application = a2
// script2
// use Store=s2
// use Application = a3
// script1
-
1\$\begingroup\$ looks fine. Does it accomplish what you need? is there an issue with the output? \$\endgroup\$Glenn Ferrie– Glenn Ferrie2016年07月07日 19:11:05 +00:00Commented Jul 7, 2016 at 19:11
-
\$\begingroup\$ It does accomplish what I need. I'm not sure if it is idiomatic f#. I wanted to know if there was some way to make it simpler to read. \$\endgroup\$Satish– Satish2016年07月07日 19:17:30 +00:00Commented Jul 7, 2016 at 19:17
-
\$\begingroup\$ Please state only code purpose in the title (ping me when you're done so I can remove the downvote) \$\endgroup\$Caridorc– Caridorc2016年07月08日 18:10:15 +00:00Commented Jul 8, 2016 at 18:10
2 Answers 2
I'll start with the code and explain after
let foldScript =
let isScriptNonEmpty { Script = script } = script <> ""
let foldAppScripts =
let foldScripts = Seq.fold (fun acc { Script = script } -> sprintf "%s\r\n%s" acc script) ""
Seq.groupBy (fun { App = app } -> app)
>> Seq.map (fun (app, group) -> sprintf "\r\n\tuse Application = %s %s" app <| foldScripts group)
>> Seq.fold (sprintf "%s%s") ""
Seq.filter isScriptNonEmpty
>> Seq.groupBy (fun { Store = store } -> store)
>> Seq.map (fun (store, group) -> sprintf "use Store=%s%s" store <| foldAppScripts group)
>> String.concat "\r\n"
let s = [ { Store = "s1"; App = "a1"; Script = "script1" }; { Store = "s1"; App = "a2"; Script = "script2" }; { Store = "s2"; App = "a3"; Script = "script4" }]
let r = foldScript s
First I made inner "function" to ease readability and reused the
Script
type Reed talked about because it's a good ideaThen you can pattern match (deconstruct) on record almost anywhere so
(if there are multiple record type with same field name you'll have to qualify it but that's not the case here)The
foldAppScripts
"function" have itself an inner "function"I choose to favor function composition (and so made "function value") if it doesn't suit you, you can put back the argument and pipe over it
I used the backward pipe to avoid using parenthesis (also a matter of taste)
The last
Seq.fold
offoldAppScripts
use eta-reduction (transformingfun arg -> fct arg
in justfct
)Your
script.ToString ()
isn't needed asscript
is already a stringI replace your last fold by a call to String.concat
Performance wise you could try to use a StringBuilder
(and use Printf.bprintf
) but it's not so easy to use
We could go further and remove each lambda by a named function (self documentation) but that's maybe overkill.
Edit
After toying a little more with this I think I managed to make the StringBuilder version working.
Not sure it was worth the effort my (simplistic) measurements show a quicker first version (some ms for 100k items so negligible) but the second has a little less pressure on GC gen 1
But that was mainly for the experiment (I finally used Printf.kbprintf
)
here is the code :
let foldScript storeAppScripts =
let builder = System.Text.StringBuilder (60 * Seq.length storeAppScripts)
let bprintf format = Printf.bprintf builder format
let kbprintf continuation format = Printf.kbprintf continuation builder format
let isScriptNonEmpty { Script = script } = script <> ""
let writeAppScripts storeAppScripts () =
let writeScripts scripts () = Seq.iter (fun { Script = script } -> bprintf "\r\n%s" script) scripts
storeAppScripts
|> Seq.groupBy (fun { App = app } -> app)
|> Seq.iter (fun (app, group) -> kbprintf (writeScripts group) "\r\n\tuse Application = %s " app)
bprintf "\r\n"
storeAppScripts
|> Seq.filter isScriptNonEmpty
|> Seq.groupBy (fun { Store = store } -> store)
|> Seq.iter (fun (store, group) -> kbprintf (writeAppScripts group) "use Store=%s" store)
string <| builder.Remove (builder.Length - 2, 2)
- First I put back arguments because it's no more just function composition in the bodies.
- I created a StringBuilder and gave him an initial capacity
The Seq.length means a first loop over the items so the diff in perf comes probably from there but without a proper capacity it'll suffer from many internal buffer resizing - Then I made some helper for
bprintf
andkbprintf
embedding the builder
eta-reduction isn't possible forbprintf
because it would then be a value and not a function so it wouldn't be generic - I renamed the inner function to reflect the change in code they now return
unit
The extra unit parameter isn't strictly required it just allow me to do eta-reduction at call site askbprintf
continuation
argument expect aunit -> 'a
- The rest of the code is nearly the same ; some
Seq.map
changed inSeq.iter
because of the unit return.
TheSeq.fold
vanished which probably means I could simplify theSeq.map (...) >> Seq.fold
in just a fold in the first version - At the end I remove the trailing
"\r\n"
added by the last call towriteAppScripts
; I wanted to find a better way without luck.
Now a little explanation about how bprintf and kbprint works :
- bprintf is "easy" it's a printf which write it's content into the Stringbuilder given (same principle as fprintf with a file)
- kbprintf do the same but take a continuation (a function) which will be executed after (for example to transform the result into another object)
I could have not used it and just put multiple bprintf instead
So for examplekbprintf (writeAppScripts group) "use Store=%s" store
will first write"use Store=XXX"
into the StringBuilder and then callwriteAppScripts
on thegroup
value which will add it's content into the StringBuilder
-
\$\begingroup\$ My input it at most a couple of 100 items. But as a learning exercise second solution gives me more insight into different ways of solving the problem. I really appreciate your extra effort. \$\endgroup\$Satish– Satish2016年07月08日 15:30:33 +00:00Commented Jul 8, 2016 at 15:30
In general, the techniques and logic appear to be fine.
My suggestions for making this more idiomatic would include:
Extract out the tuple into a record. Using a string*string*string
to represent the values(store, app, script)
is less clear than having a type for it:
type Script = { Store : string ; App : string ; Script : string }
Consider using .filter
instead of .choose
when appropriate. You really only need choose
if you're doing a filter and a map operation, or already using options. For a simple filter, you can rewrite:
|> Seq.choose( fun (store, app, script) ->
match script with
| "" -> None
| _ -> Some (store, app, script) )
as
|> Seq.filter( fun (_, _, script) -> script <> "")
Finally, I would consider extracting out the "subqueries" as a function, and just calling them. Having nested lambda expressions is fine, but can be difficult to read.
-
\$\begingroup\$ Thanks, For the line
Seq.groupBy(fun (store, app, script) -> store)
FSharpLint says fun x -> x` might be able to be refactored intoid
. Is that possible? \$\endgroup\$Satish– Satish2016年07月07日 19:37:29 +00:00Commented Jul 7, 2016 at 19:37 -
\$\begingroup\$ @Satish No, it's a linter error - but if you switch to a record, it'll go away too \$\endgroup\$Reed Copsey– Reed Copsey2016年07月07日 19:39:38 +00:00Commented Jul 7, 2016 at 19:39