OK, so I recently posted this solution in Python to the longest non-decreasing subsequence problem, and cleaned it up with some expert advice. As a next step, I wanted to translate this solution into Haskell. This presents a challenge because the algorithm as originally formulated depends heavily on mutable arrays with O(1) lookup and update behavior, which are standard in Python but a little harder to come by in Haskell.
Some excellent answers on Stackoverflow pointed me towards Data.Seq
as a mutable, indexable array type, and Numeric.Search.Range
for a generic binary search operation (which is key to making the algorithm efficient).
I wanted to express this algorithm as a fold over the input list because that seemed like a natural way to do it. However the Python version depends on many O(1) lookups to arbitrary places in the input list - not really possible within a Haskell fold over an ordinary list. I saw that this could be avoided by, rather than doing what the Python version does - namely storing indices into the input array of "the best subsequence of length n found so far" - storing the best actual subsequence as a Haskell list. At first sight, this seems incredibly wasteful, because at the end of the calculation for an input list of length n
there could be as many as n
"best subsequence" lists which are on average n / 2
long, creating space requirements of order n^2
! However, since these are Haskell lists derived from each other, they actually share a lot of the same space... for example, in the case where the input list is already a non-decreasing list of length n
, the calculation will involve creating n
subsequence lists of lengths [1..n], but each one shares a tail with its predecessor and the total space requirement is just proportional to n
.
The code is below. What I am looking for is: 1) general comments on readability, correctness, etc. and 2) a way to prove an upper bound on the space requirement.
import Data.Sequence as DS
import Numeric.Search.Range
seqLast::Seq a -> a
seqLast xs = index xs ((DS.length xs) - 1)
-- Return the longest non-decreasing subsequence of an input sequence of integers
nonDecreasing::[Int]->[Int]
nonDecreasing = Prelude.reverse . seqLast . makeM
where makeM = foldl updateM empty
where updateM m x | DS.null m = singleton [x]
| insert_idx == Nothing = m |> (x:(seqLast m))
| just_insert_idx == 0 = update 0 [x] m
| otherwise = update just_insert_idx (x:(index m (just_insert_idx - 1))) m
where insert_idx = searchFromTo (\idx -> x < (head (index m idx))) 0 ((DS.length m) - 1)
just_insert_idx = maybe 0 id insert_idx
-
\$\begingroup\$ I tried to run this code but I got the error below. How it can be updated to work with a recent version of Haskell? main.hs:2:1: error: Could not find module `Numeric.Search.Range' Use -v to see a list of the files searched for. \$\endgroup\$Raghad– Raghad2022年04月16日 19:09:12 +00:00Commented Apr 16, 2022 at 19:09
1 Answer 1
A few general remarks
I find that stepped whiles are as evil as nested if statements for readability of the code. Here, you have three levels of nesting. Relying on the surrounding environment for variables can some times obscure possibility to generalize (You make use of x and m for inner definitions)
So here is what I would do
import Data.Sequence as DS
import Numeric.Search.Range as R
seqLast::Seq a -> a
seqLast xs = index xs ((DS.length xs) - 1)
insert_idx :: Ord a => a -> Seq [a] -> Maybe Int
insert_idx y m = searchFromTo (fn y) 0 $ (DS.length m) - 1
where fn y idx = y < (head (index m idx))
-- Return the longest non-decreasing subsequence of an input sequence of integers
nonDecreasing::[Int] -> [Int]
nonDecreasing = Prelude.reverse . seqLast . makeM
where makeM = foldl updateM empty
updateM m x | DS.null m = singleton [x]
| otherwise = case insert_idx x m of
Nothing -> m |> (x: seqLast m)
Just l -> update l (x:geti l m) m
geti 0 _ = []
geti l m = index m (l -1)
-
\$\begingroup\$ I see your point about the nested
where
's being out of control... but doesn't a nestedwhere
at least make it clear thatinsert_idx
is not used anywhere else in the program? \$\endgroup\$gcbenison– gcbenison2012年04月26日 21:57:35 +00:00Commented Apr 26, 2012 at 21:57 -
\$\begingroup\$ Do you gain any thing by that knowledge? Nesting causes us to overlook places where we can create general functions that can be used elsewhere. (On the other hand, I fear I have no hard data to back that up. Perhaps a question for S.O?) \$\endgroup\$Rahul Gopinath– Rahul Gopinath2012年04月27日 00:31:09 +00:00Commented Apr 27, 2012 at 0:31