10
\$\begingroup\$

Trying to solve this combinatorics problem, I wrote some functions to count the number of unordered distinct factorizations of an integer n into k distinct parts.

The following function builds up a list of unordered distinct factorizations of n with largest part at most m.

duf' :: Integer -> Integer -> [[Integer]]
duf' m 1 = [[]]
duf' 1 n = []
duf' 0 n = []
duf' m n | isPrime n = if m < n then [] else [[n]]
 | otherwise = concatMap (\x -> map (\y -> x : y) $ 
 duf' (x-1) (n `div` x) ) $ tail $ 
 takeWhile (<= m) $ toList $ divisors n

It is then used below to generate the list of all unordered distinct factorizations of n.

duf :: Integer -> [[Integer]]
duf n = duf' n n

To count the number of factorizations of k parts only I use a simple (and inefficient ?) filter.

pd n k = length . filter (\x -> k == length x) $ duf n

As a beginner, I have the feeling this is not really Haskell-ish. Any help to improve the syntax is welcome !

My concern is that I would like to avoid filtering and (if possible) to count without generating an enormous list. Is there a way to "merge" pd and duf' ? Or at least optimizing the thing ?

I use this two external functions : Math.NumberTheory.Primes.Factorisation.divisors and Data.Numbers.Primes.isPrime. Are there better choices ?

asked Jun 11, 2015 at 8:10
\$\endgroup\$

1 Answer 1

4
+250
\$\begingroup\$

I took a couple passes (gist) at this and managed to improve the style a bit, but I couldn't move the needle on performance at all.

Style

My thoughts and justifications on important changes are written as in-line comments below.

module Main where
import qualified Math.NumberTheory.Primes.Factorisation as Factorisation
import Data.Numbers.Primes (isPrime)
import Data.Set (toAscList)
duf' :: Integer -> Integer -> [[Integer]]
duf' _ 1 = [[]] -- Underscores in place of unused variable names
duf' 1 _ = [ ]
duf' 0 _ = [ ]
duf' m n | isPrime n = if m < n then [] else [[n]]
 | otherwise = concatMap (\x -> map (x:) $ duf' (x-1) (n `div` x)) -- Sectioning cons operator
 (tail . takeWhile (<= m) $ divisors n) -- Reflowed for legibility
 where
 divisors = toAscList . Factorisation.divisors -- Locally bound to decrease datatype mismatch line noise
duf :: Integer -> [[Integer]]
duf n = duf' n n
pd :: Integer -> Int -> Int
pd n k = length . filter ((== k) . length) $ duf n -- Function composition for ordering

Some reorganizing of layout in duf' made it a lot more comprehensible.

Performance

After rewriting duf', I noticed that the inner lists it returns are only ever really used for their length, so I came up with this version that tosses out the actual elements in favor of keeping a tally whenever a new element would be added instead.

duf' :: Integer -> Integer -> [Int]
duf' _ 1 = [0]
duf' 1 _ = [ ]
duf' 0 _ = [ ]
duf' m n | isPrime n = if m < n then [] else [1]
 | otherwise = concatMap (\x -> map (+1) $ duf' (x-1) (n `div` x))
 (tail . takeWhile (<= m) $ divisors n)
 where
 divisors = toAscList . Factorisation.divisors
duf :: Integer -> [Int]
duf n = duf' n n
pd :: Integer -> Int -> Int
pd n k = length . filter (== k) $ duf n

Unfortunately, this doesn't net us any wins in run time as I only really smeared out the inner call to length from filter in pd across duf' (this is less idiomatic, in my opinion). One potential positive is that this version may be more space efficient due to not keeping lists of elements around when all we really care about is the list spine anyway. I wasn't testing for that though, so I don't have any cold hard facts at hand and I suspect you'd have to use an argument strict version of (+1) to prevent building up chains of thunks before realizing any real decrease in memory.

answered Jul 10, 2015 at 23:03
\$\endgroup\$
3
  • \$\begingroup\$ Could you elaborate a bit about using "an argument strict version of (+1)" ? \$\endgroup\$ Commented Jul 12, 2015 at 14:45
  • 1
    \$\begingroup\$ Sure, you'd turn on {-# LANGUAGE BangPatterns #-} (put this compiler directive at the top of your file) then write a function plus !m !n = m + n. The exclamation points are strictness annotations enabled by the BangPatterns extension, when Haskell evaluates plus it will evaluate both arguments to WHNF (Weak Head Normal Form), which for Ints means they will be fully evaluated. If you have (1 + 2) + 3 GHC will normally defer evaluating (1 + 2) until you need the value of the whole expression, causing a space leak. \$\endgroup\$ Commented Jul 12, 2015 at 18:32
  • \$\begingroup\$ plus (plus 1 2) 3 will be more eager, the outer plus will evaluate its arguments and so the only thunk hanging around will be for plus 3 3. \$\endgroup\$ Commented Jul 12, 2015 at 18:33

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.