2
\$\begingroup\$

I have a list of path I want to filter out. I want to keep only paths matching a specific pattern and remove all paths matching another specific pattern.

To achieve this I pattern matched on each pattern. Here is the code:

let stringToSpan (s: string) =
 System.ReadOnlySpan<char>(s.ToCharArray())
let matchWildcard pattern text =
 System.IO.Enumeration.FileSystemName.MatchesSimpleExpression(stringToSpan pattern, stringToSpan text)
let filterByPattern (includePattern: string Option) (excludePattern: string Option) (paths: string seq) :string seq =
 paths
 |> Seq.filter
 (fun path ->
 let shouldKeep =
 match includePattern with
 | Some pattern -> matchWildcard pattern path
 | None -> true
 if not shouldKeep then
 false
 else
 match excludePattern with
 | Some pattern -> matchWildcard pattern path |> not
 | None -> true)

You can try this snippet with:

seq {"test"; "foo/bar"; "foo/toto"}
|> filterByPattern (Some "foo/*") (Some "*toto*")
|> printfn "%A"
// output: foo/bar

If it matters, this snippet is used inside a dotnet tool to clone multiple git repositories. You can find this tool on GitHub.

I think that we could improve this snippet a bit. Can we?

asked May 22, 2021 at 10:04
\$\endgroup\$
2
  • \$\begingroup\$ Please follow the guidelines: codereview.stackexchange.com/help/how-to-ask \$\endgroup\$ Commented May 22, 2021 at 13:46
  • 1
    \$\begingroup\$ Don't call .ToCharArray() in your stringToSpan function. You're forcing it to allocate a character array on the heap when it's not needed. If there's nothing built into F# to support this, you should call System.MemoryExtensions.AsSpan(s) in this function instead. \$\endgroup\$ Commented Jun 1, 2021 at 22:37

3 Answers 3

3
\$\begingroup\$

This could be simplified by using straightforward composition:

let matches (x: string) (y: string) =
 System.IO.Enumeration.FileSystemName.MatchesSimpleExpression
 ( System.MemoryExtensions.AsSpan x
 , System.MemoryExtensions.AsSpan y
 )
seq {"test"; "foo/bar"; "foo/toto"}
|> Seq.filter (matches "foo/*")
|> Seq.filter (matches "*toto*" >> not)
|> printfn "%A"

And that's it! Just filter it twice.

This way you don't need to figure out what each of the arguments to filterByPattern mean. If you really want to make a function out of it:

let includeThenExclude include' exclude =
 Seq.filter (matches include') >> Seq.filter (matches exclude >> not)

If you only sometimes need to include or exclude, instead of Options:

let including x = matches x |> Seq.filter
let excluding x = matches x >> not |> Seq.filter
let includingFoo =
 seq {"test"; "foo/bar"; "foo/toto"}
 |> including "foo/*"
let includingFooExcludingToto =
 seq {"test"; "foo/bar"; "foo/toto"}
 |> including "foo/*"
 |> excluding "*toto*"
answered Jul 22, 2021 at 21:13
\$\endgroup\$
2
\$\begingroup\$

Lets focus on filterByPattern.

A Predicate is a function returning a boolean ('T -> bool). Since the predicate host most of the logic, lets start by spliting the Predicate from the filter to work on it.

let patternPredicate (includePattern: string Option) (excludePattern: string Option) (path: string) :bool =
 let shouldKeep =
 match includePattern with
 | Some pattern -> matchWildcard pattern path
 | None -> true
 if not shouldKeep then
 false
 else
 match excludePattern with
 | Some pattern -> matchWildcard pattern path |> not
 | None -> true
let filterByPattern (includePattern: string Option) (excludePattern: string Option) (paths: string seq) :string seq =
 Seq.filter (patternPredicate includePattern excludePattern) paths

Alright now we will stop modifying filterByPattern and focus on patternPredicate. Since we are only dealing with boolean maybe there is a refactoring to do here. Lets switch from pattern matching to basic boolean logic.

let patternPredicate (includePattern: string Option) (excludePattern: string Option) (path: string) :bool =
 let shouldKeep = includePattern.IsNone || (includePattern.IsSome && matchWildcard includePattern.Value path)
 if not shouldKeep then
 false
 else
 excludePattern.IsNone || (excludePattern.IsSome && (not (matchWildcard excludePattern.Value path))))

shouldKeep can be seen as (not a) or (a and b) where a is the option state and b the result of the matching.

Since we don't have to check twice the value of a, (not a) or (a and b) can be simplify to (not a) or b. If you want to verify this logic, you can use a Truth Table or rely on a tool like decode.fr's Boolean Expressions Calculator

docde output

Lets apply this to our function!

let patternPredicate (includePattern: string Option) (excludePattern: string Option) (path: string) :bool =
 let shouldKeep = includePattern.IsNone || (matchWildcard includePattern.Value path)
 if not shouldKeep then
 false
 else
 excludePattern.IsNone || (not (matchWildcard excludePattern.Value path))

Thats better! Now lets get rid of this if/else clause and some parenthesis also.

let patternPredicate (includePattern: string Option) (excludePattern: string Option) (path: string) :bool =
 (includePattern.IsNone || matchWildcard includePattern.Value path) &&
 (excludePattern.IsNone || not (matchWildcard excludePattern.Value path))

So the complete snippet is now:

let stringToSpan (s: string) =
 System.ReadOnlySpan<char>(s.ToCharArray())
let matchWildcard pattern text =
 System.IO.Enumeration.FileSystemName.MatchesSimpleExpression(stringToSpan pattern, stringToSpan text)
let patternPredicate (includePattern: string Option) (excludePattern: string Option) path =
 (includePattern.IsNone || matchWildcard includePattern.Value path) &&
 (excludePattern.IsNone || not (matchWildcard excludePattern.Value path)) 
let filterByPattern (includePattern: string Option) (excludePattern: string Option) (paths: string seq) :string seq =
 Seq.filter (patternPredicate includePattern excludePattern) paths
seq {"test"; "foo/bar"; "foo/toto"}
|> filterByPattern (Some "foo/*") (Some "*toto*")
|> printfn "%A"
// output: foo/bar
answered May 22, 2021 at 10:04
\$\endgroup\$
2
\$\begingroup\$

I would avoid using Option .IsNone and .Value, mainly because .Value will throw an exception if you made a mistake and called it at the wrong time. Almost all F# code will avoid that property and use pattern matching instead or some other function.

Here's another way to write the patternPredicate function, which I renamed to includePath do describe what it actually does:

let includePath (includePattern: string option) (excludePattern: string option) path =
 let isIncluded =
 match includePattern with
 | Some includePattern -> matchWildcard includePattern path
 | None -> true
 let isExcluded =
 lazy
 match excludePattern with
 | Some excludePattern -> matchWildcard excludePattern path
 | None -> false
 isIncluded && not isExcluded.Value

My approach here was to first build up the values isIncluded and isExcluded so that we can write isIncluded && not isExcluded.Value. This breaks the problem down into understandable and explicit parts.

Note the use of a lazy value for isExcluded. This means that it won't be evaluated unless isIncluded is true, so we don't need to call matchWildcard unnecessarily.

answered May 22, 2021 at 13:37
\$\endgroup\$
1
  • \$\begingroup\$ I really like the use of lazy here. \$\endgroup\$ Commented May 22, 2021 at 14:02

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.