I need to split a string into parts of varying predefined lengths, given as a sequence of integers. After a first imperative attempt using a sequence expression and a reference cell, I have now come up with this implementation that uses Seq.scan and doesn't need mutability:
let segmentString lengths (input : string) =
    let folder (start, parts) lengthHere =
        let partHere = input.Substring(start, lengthHere)
        start + lengthHere, partHere :: parts
    let results = lengths |> Seq.scan folder (0, [])
    let _, resultParts = Seq.last results
    List.rev resultParts
This works and is probably "good enough", but I wonder whether there are things that could be improved or made more idiomatic (other than error handling and allowing more generic inputs than strings; I left those out because they are not a concern right now). What especially bugs me is the need to reverse the results list at the end, but obviously, I could only add the new part as the head of the aggregate list in the folder function.
2 Answers
This is a pretty good solution but like you say, the reverse is annoying, so let's see if we can fix that.
We want a sequence of tuples (start, length) that we can pass to Substring. We already have the lengths, so we just need to figure out the starts. The skeleton of our solution looks like this:
let segmentString lengths (input : string) =
    let segments = Seq.zip ??? lengths
    Seq.map (input.Substring : int * int -> string) segments
So where does a segment start? It starts at the sum of the previous lengths. We can use Seq.scan (+) 0 to calculate the partial sums of the lengths.
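To make the partial sums concrete, here is a small sketch with a made-up list of lengths (the values are only for illustration). Note that Seq.scan yields one more element than its input, namely the grand total at the end; Seq.zip stops at the shorter sequence, so that extra element is dropped when the starts are paired with the lengths.

```fsharp
// Hypothetical sample lengths, chosen only for illustration.
let lengths = [2; 3; 1]

// Running totals, beginning with the initial state 0.
// The last element (6) is the sum of all lengths.
let starts = Seq.scan (+) 0 lengths |> List.ofSeq

// Seq.zip truncates to the shorter sequence, so the trailing 6 is ignored.
let segments = Seq.zip starts lengths |> List.ofSeq

printfn "%A" starts     // [0; 2; 5; 6]
printfn "%A" segments   // [(0, 2); (2, 3); (5, 1)]
```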
This gives us the final solution:
let segmentString lengths (input : string) =
    let segments = Seq.zip (Seq.scan (+) 0 lengths) lengths
    Seq.map (input.Substring : int * int -> string) segments
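As a quick usage sketch (the input string and lengths are invented for the example), the function can be exercised like this. The result is a lazy seq, so it is converted to a list here to force evaluation.

```fsharp
let segmentString lengths (input : string) =
    let segments = Seq.zip (Seq.scan (+) 0 lengths) lengths
    Seq.map (input.Substring : int * int -> string) segments

// Hypothetical sample data; List.ofSeq forces the lazy sequence.
let parts = segmentString [2; 3; 1] "abcdefgh" |> List.ofSeq

printfn "%A" parts   // ["ab"; "cde"; "f"]
```

Two things are worth noticing: the trailing "gh" is not covered by any length and is simply left out, and because Seq.map is lazy, an out-of-range length only raises an exception from Substring once the result is enumerated.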
That is the more functional solution I was looking for, thanks! (TeaDrivenDev, Sep 13, 2014 at 14:10)
You can avoid that reversing by using manual recursion:
let segmentString lengths (input : string) =
    let rec segmentString' start = function
        | [] -> []
        | length::lengthsTail ->
            let segment = input.Substring(start, length)
            let segmentsTail = segmentString' (start + length) lengthsTail
            segment::segmentsTail
    segmentString' 0 (List.ofSeq lengths)
I'm generally not in favor of using recursion directly; most of the time, you should use higher-order functions instead. But here, that higher-order function would be a hybrid between fold and foldBack (the state has to be computed front to back, but the result is built back to front), so I think that recursion makes sense here.
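For readers on newer versions of FSharp.Core (4.0 and later, which postdate this answer), List.mapFold comes close to the hybrid described above: it threads a state from front to back while producing the mapped results in their original order. The following is only a sketch of that alternative, not part of the original answer; the helper name step is made up for the example.

```fsharp
let segmentString lengths (input : string) =
    // step (hypothetical helper name): takes the current start offset and a
    // length, returns the extracted part together with the next start offset.
    let step start length =
        input.Substring(start, length), start + length
    // mapFold returns (results, finalState); only the results are needed.
    lengths |> List.ofSeq |> List.mapFold step 0 |> fst

// Hypothetical sample data.
let parts = segmentString [2; 3; 1] "abcdefgh"

printfn "%A" parts   // ["ab"; "cde"; "f"]
```

Unlike the Seq.map version, this variant evaluates eagerly, so an out-of-range length raises its exception at the call site.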
… lengths is smaller than the length of input? Just ignore the rest of the input, like your code does?