Counting unordered factorizations with distinct parts

Question 1

Trying to solve this combinatorics problem, I wrote some functions to count the number of unordered distinct factorizations of an integer n into k distinct parts.

The following function builds up a list of unordered distinct factorizations of n with largest part at most m.

duf' :: Integer -> Integer -> [[Integer]]
duf' m 1 = [[]]
duf' 1 n = []
duf' 0 n = []
duf' m n | isPrime n = if m < n then [] else [[n]]
 | otherwise = concatMap (\x -> map (\y -> x : y) $ 
 duf' (x-1) (n `div` x) ) $ tail $ 
 takeWhile (<= m) $ toList $ divisors n

It is then used below to generate the list of all unordered distinct factorizations of n.

duf :: Integer -> [[Integer]]
duf n = duf' n n

To count the number of factorizations of k parts only I use a simple (and inefficient ?) filter.

pd n k = length . filter (\x -> k == length x) $ duf n

As a beginner, I have the feeling this is not really Haskell-ish. Any help to improve the syntax is welcome !

My concern is that I would like to avoid filtering and (if possible) to count without generating an enormous list. Is there a way to "merge" pd and duf' ? Or at least optimizing the thing ?

I use this two external functions : Math.NumberTheory.Primes.Factorisation.divisors and Data.Numbers.Primes.isPrime. Are there better choices ?

Question 2

I took a couple passes (gist) at this and managed to improve the style a bit, but I couldn't move the needle on performance at all.

Style

My thoughts and justifications on important changes are written as in-line comments below.

module Main where
import qualified Math.NumberTheory.Primes.Factorisation as Factorisation
import Data.Numbers.Primes (isPrime)
import Data.Set (toAscList)
duf' :: Integer -> Integer -> [[Integer]]
duf' _ 1 = [[]] -- Underscores in place of unused variable names
duf' 1 _ = [ ]
duf' 0 _ = [ ]
duf' m n | isPrime n = if m < n then [] else [[n]]
 | otherwise = concatMap (\x -> map (x:) $ duf' (x-1) (n `div` x)) -- Sectioning cons operator
 (tail . takeWhile (<= m) $ divisors n) -- Reflowed for legibility
 where
 divisors = toAscList . Factorisation.divisors -- Locally bound to decrease datatype mismatch line noise
duf :: Integer -> [[Integer]]
duf n = duf' n n
pd :: Integer -> Int -> Int
pd n k = length . filter ((== k) . length) $ duf n -- Function composition for ordering

Some reorganizing of layout in duf' made it a lot more comprehensible.

Performance

After rewriting duf', I noticed that the inner lists it returns are only ever really used for their length, so I came up with this version that tosses out the actual elements in favor of keeping a tally whenever a new element would be added instead.

duf' :: Integer -> Integer -> [Int]
duf' _ 1 = [0]
duf' 1 _ = [ ]
duf' 0 _ = [ ]
duf' m n | isPrime n = if m < n then [] else [1]
 | otherwise = concatMap (\x -> map (+1) $ duf' (x-1) (n `div` x))
 (tail . takeWhile (<= m) $ divisors n)
 where
 divisors = toAscList . Factorisation.divisors
duf :: Integer -> [Int]
duf n = duf' n n
pd :: Integer -> Int -> Int
pd n k = length . filter (== k) $ duf n

Unfortunately, this doesn't net us any wins in run time as I only really smeared out the inner call to length from filter in pd across duf' (this is less idiomatic, in my opinion). One potential positive is that this version may be more space efficient due to not keeping lists of elements around when all we really care about is the list spine anyway. I wasn't testing for that though, so I don't have any cold hard facts at hand and I suspect you'd have to use an argument strict version of (+1) to prevent building up chains of thunks before realizing any real decrease in memory.

Question 3

Could you elaborate a bit about using "an argument strict version of (+1)" ?

Question 4

Sure, you'd turn on {-# LANGUAGE BangPatterns #-} (put this compiler directive at the top of your file) then write a function plus !m !n = m + n. The exclamation points are strictness annotations enabled by the BangPatterns extension, when Haskell evaluates plus it will evaluate both arguments to WHNF (Weak Head Normal Form), which for Ints means they will be fully evaluated. If you have (1 + 2) + 3 GHC will normally defer evaluating (1 + 2) until you need the value of the whole expression, causing a space leak.

Question 5

plus (plus 1 2) 3 will be more eager, the outer plus will evaluate its arguments and so the only thunk hanging around will be for plus 3 3.

R B R B 1,0337 silver badges9 bronze badges · Accepted Answer · 2015-07-10 23:03:21Z

I took a couple passes (gist) at this and managed to improve the style a bit, but I couldn't move the needle on performance at all.

Style

My thoughts and justifications on important changes are written as in-line comments below.

module Main where
import qualified Math.NumberTheory.Primes.Factorisation as Factorisation
import Data.Numbers.Primes (isPrime)
import Data.Set (toAscList)
duf' :: Integer -> Integer -> [[Integer]]
duf' _ 1 = [[]] -- Underscores in place of unused variable names
duf' 1 _ = [ ]
duf' 0 _ = [ ]
duf' m n | isPrime n = if m < n then [] else [[n]]
 | otherwise = concatMap (\x -> map (x:) $ duf' (x-1) (n `div` x)) -- Sectioning cons operator
 (tail . takeWhile (<= m) $ divisors n) -- Reflowed for legibility
 where
 divisors = toAscList . Factorisation.divisors -- Locally bound to decrease datatype mismatch line noise
duf :: Integer -> [[Integer]]
duf n = duf' n n
pd :: Integer -> Int -> Int
pd n k = length . filter ((== k) . length) $ duf n -- Function composition for ordering

Some reorganizing of layout in duf' made it a lot more comprehensible.

Performance

After rewriting duf', I noticed that the inner lists it returns are only ever really used for their length, so I came up with this version that tosses out the actual elements in favor of keeping a tally whenever a new element would be added instead.

duf' :: Integer -> Integer -> [Int]
duf' _ 1 = [0]
duf' 1 _ = [ ]
duf' 0 _ = [ ]
duf' m n | isPrime n = if m < n then [] else [1]
 | otherwise = concatMap (\x -> map (+1) $ duf' (x-1) (n `div` x))
 (tail . takeWhile (<= m) $ divisors n)
 where
 divisors = toAscList . Factorisation.divisors
duf :: Integer -> [Int]
duf n = duf' n n
pd :: Integer -> Int -> Int
pd n k = length . filter (== k) $ duf n

Unfortunately, this doesn't net us any wins in run time as I only really smeared out the inner call to length from filter in pd across duf' (this is less idiomatic, in my opinion). One potential positive is that this version may be more space efficient due to not keeping lists of elements around when all we really care about is the list spine anyway. I wasn't testing for that though, so I don't have any cold hard facts at hand and I suspect you'd have to use an argument strict version of (+1) to prevent building up chains of thunks before realizing any real decrease in memory.

Could you elaborate a bit about using "an argument strict version of (+1)" ?
Sure, you'd turn on {-# LANGUAGE BangPatterns #-} (put this compiler directive at the top of your file) then write a function plus !m !n = m + n. The exclamation points are strictness annotations enabled by the BangPatterns extension, when Haskell evaluates plus it will evaluate both arguments to WHNF (Weak Head Normal Form), which for Ints means they will be fully evaluated. If you have (1 + 2) + 3 GHC will normally defer evaluating (1 + 2) until you need the value of the whole expression, causing a space leak.
plus (plus 1 2) 3 will be more eager, the outer plus will evaluate its arguments and so the only thunk hanging around will be for plus 3 3.

Stack Exchange Network

Counting unordered factorizations with distinct parts

1 Answer 1

Style

Performance

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Counting unordered factorizations with distinct parts

1 Answer 1

Style

Performance

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions