As a simple demonstration of the efficiency of Haskell style, I thoughtlessly ran the following:
take 100 [(a, b, c) | a <- [1..], b <- [1..], c <- [1..], a^2 + b^2 == c^2]
This should be a way of deriving the first 100 Pythagorean triples, with duplicates. In practice however, it never halts, because the algorithm itself defies lazy evaluation.
To think about it in terms of actual implementation, the following should be something similar to how the list comprehension is actually evaluated, in an imperative style:
results = []
for (a = 0; a < ∞; a++) {
for (b = 0; b < ∞; b++) {
for (c = 0; c < ∞; c++) {
if (a^2 + b^2 == c^2) {
results[] = [a, b, c]
}
}
}
}
When written like this, it becomes obvious that the function can never yield results, because infinite time will be spent testing whether 1^2 + 1^1 == c^2
, as only the innermost for
loop will advance, and a
and b
will remain '1'.
The common solution in this particular case is to constrain the values of the smallest two variables to that of the largest:
take 100 [(a, b, c) | c <- [1..],
a <- [1..c], b <- [1..c],
a^2 + b^2 == c^2]
However, this seems like an obvious oversight for implementors of the language. When you think about it, any list comprehension with more than one infinite source of search space will never halt, for the same reason, except some will yield useful results when (1, 1, x) is useful. There are questions discussing this problem, but most discuss specific cases, rather than the problem overall. Why isn't fixing this within the language with a different iteration pattern trivial?
-
The question is: can you find a general strategy that works for all possible cases? Can a compiler analyze the snippet of code you provided and guess that no solution will ever exist when a or b is greater than c? However I am not sure myself how to answer this properly. Maybe someone could answer you on cstheory.stackexchange.com (see e.g. cstheory.stackexchange.com/questions/7327/…)coredump– coredump2015年08月13日 09:41:25 +00:00Commented Aug 13, 2015 at 9:41
-
See "dovetaling". It is non-trivial because trivial interpretation of list comprehensions is as nested loops which are trivially enumerated in a trivial top-down non-interlaced odometer-like fashion.Will Ness– Will Ness2023年04月14日 18:19:49 +00:00Commented Apr 14, 2023 at 18:19
3 Answers 3
It is certainly possible to define an iteration order for x1 <- [1..], x2 <- [1..], ..., xK <- [1..]
that will reach any tuple (c1, c2, ..., cK)
after some finite amount of steps, and tries each tuple only once. Take any computable bijection between the natural numbers N and the k-tuples of natural numbers, N^k. Such a bijection exists, as the Cartesian product of countable sets is again countable. One example would be repeatedly applying the (inverse of the) Cantor pairing function. The precise bijection being used would have to be specified by the Haskell report, because it affects the order in which results are found.
However, this is of limited utility, because it would only cover the very specific pattern with [1..]
. There are countless other ways to produce infinte lists, and not only is it far more involved to define a suitable iteration order for those cases, it is also impossible for the compiler to detect infinite lists in all cases (Rice's theorem). So you'd have to define some heuristic for when this transformation applies. In the interest of being feasible to describe and implement across multiple interpreters and compilers, this would likely be a very simplistic heuristic.
So what you are left with is a subtle change to the semantics of a program, with no indication in the program source code, that only happens when some arcane and limited heuristic fires. Thanks, I'll pass. Just explicitly write the program to do what you want.
-
I like this answer. However, assuming that iteration (in Haskell) is always implemented as a [0..] list that is filtered for actual conditions required would mean that any iteration order that works for [1..] would work for any other problem involving the same number of infinite spaces. I.e. given that the process of counting integers always starts with the integers from 0 (or 1) to infinity, which is then reduced as needed, a more helpful question would be given x number of infinite spaces, is there an iteration order that can reach any (n1, n2 ... nx) tuple given n <- [1..n] in any case?TheEnvironmentalist– TheEnvironmentalist2015年08月14日 06:13:34 +00:00Commented Aug 14, 2015 at 6:13
-
@TheEnvironmentalist There's more to infinite lists than infinite lists of integers, though. Why are those special? (Hint: They aren't.)user7043– user70432015年08月14日 11:27:07 +00:00Commented Aug 14, 2015 at 11:27
-
True, they're not the only type of infinite list, but if the factor preventing this change is the need for custom improvements for each case, this assumption allows us to generalize, and therefore make it more practical. In fact, considering any (rational) countable decimal can be expressed as an integer divided by some constant, the decimals fall under the same generalization. Come to think of it, most countable types could be represented by integers, making this about as general a case as you can get.TheEnvironmentalist– TheEnvironmentalist2015年08月14日 11:42:40 +00:00Commented Aug 14, 2015 at 11:42
-
All countable types can be represented by (non-negative) integers in some way. That is what it means to be countable. And all types expressible in Haskell (and indeed the set of all possible values in any computational process) are countable. The question is whether the compiler can infer an appropriate bijection, respecting the numerous invariants that most types have (e.g. rationals usually want to be irreducible, floating point-esque numbers want to be normalized, etc.) and whether the order is both (1) understandable and (2) the order the programmer wants.user7043– user70432015年08月14日 12:05:25 +00:00Commented Aug 14, 2015 at 12:05
-
For example, with most bijections the code in your questions would not find the first N Pythagorean triples, just some triples in an order that is meaningless unless you keep the exact bijection in mind. And none of this even begins to address the question of how you would decide when to apply this tranformation.user7043– user70432015年08月14日 12:05:58 +00:00Commented Aug 14, 2015 at 12:05
A short answer: You just need to use a different monad (omega), for lists this is working as intended.
To elaborate: For list comprehensions, you want to have the results in a specific order. And while for your use-case you want a different order, for others this order is what they need and want.
The problem is that if you do diagonalization, like the omega monad does, it's quite hard to decide how to sequence it. Consider that:
- The lists might contain complex sequences, like powers of two. How would you prioritize then arithmetic vs geometric sequences?
- The lists might contain values of any type, which might not even be ordered.
Given the above, even if you have a guarantee that you'll eventually find all combinations in a finite time, in practice this will still mean that one particular one might take really long to find.
With omega and monad comprehensions you can easily express your problem as
test :: [(Integer, Integer, Integer)]
test = take 100 $ runOmega [(a, b, c) | a <- each [1..]
, b <- each [1..]
, c <- each [1..]
, a^2 == b^2 + c^2]
However you'll see that even though it eventually finds all triples, it won't find the "good ones" in a reasonable time.
For a real use-case, I'd suggest to use the LogicT monad. There you have explicit control over diagonalization using interleave
and >>-
.
Well the simplest, practical answer, which does take experience to appreciate, is that this situation is at most a minor annoyance. A more elaborate answer, which takes still more experience to appreciate, comprises first two observations and a conclusion.
First observation: the Haskell community sees list comprehensions as a concrete application of monads. The features offered by list comprehension coincide with the those provided by MonadPlus
instance for lists, and GHC has a MonadComprehensions
extension that allows you to use the comprehension syntax with any monad.
Second observation: there is a classic technique for interleaving infinite lists, due to famous mathematician Georg Cantor's proof that the set of natural numbers has the same cardinality ("size") as the set of rational numbers. This can be easily translated into Haskell:
{-# LANGUAGE DeriveFunctor, TupleSections #-}
import Control.Applicative
cantor :: [a] -> [b] -> [(a, b)]
cantor [] _ = []
cantor _ [] = []
cantor (a:as) (b:bs) =
(a,b) : interleave (map (a,) bs) (interleave (map (,b) as) (cantor as bs))
interleave :: [a] -> [a] -> [a]
interleave [] ys = ys
interleave (x:xs) ys = x : interleave ys xs
And we might try to define an alternative list Monad
instance based on that. First we define a wrapper type Cantor
:
newtype Cantor a = Cantor [a]
deriving (Eq, Functor)
Monad
is a subclass of Applicative
, so first we define an Applicative
instance for Cantor
:
instance Applicative Cantor where
pure a = Cantor [a]
Cantor fs <*> Cantor as = Cantor (map (uncurry ($)) (cantor fs as))
...but here we've already gone wrong, because it turns out that this violates the laws ("contract", in OO terms) of the Applicative
class. In Applicative
, the following unit test must succeed no matter what values are passed in as arguments:
-- | One of the consequences of the associative law for 'Applicative'.
-- This function must return true no matter what arguments are passed
-- in. Haskell's QuickCheck unit testing library runs tests in this
-- format, and this one will fail if we run it on 'Cantor'.
associativity
:: (Applicative f, Eq (f (a, (b, c)))) => f a -> f b -> f c -> Bool
associativity as bs cs =
as `pair` (bs `pair` cs) == fmap assoc ((as `pair` bs) `pair` cs)
pair :: Applicative f => f a -> f b -> f (a, b)
pair = liftA2 (,)
assoc :: ((a, b), c) -> (a, (b, c))
assoc ((a, b), c) = (a, (b, c))
That this test fails can be verified very easily by just running a few simple examples. Note that pair
in the Cantor
type is the same as the cantor
function as above, so we can simplify this test to this:
associativity' :: (Eq a, Eq b, Eq c) => [a] -> [b] -> [c] -> Bool
associativity' as bs cs =
as `cantor` (bs `cantor` cs) == map assoc ((as `cantor` bs) `cantor` cs)
So Cantor
is not a law-abiding Applicative
. It follows it cannot be a law-abiding Monad
either.
Now the conclusion is the following: these two observations, put together, are an answer to your question. Making list comprehensions use cantor
or some similar technique to work in the infinite lists case would mean abandoning the idea that list comprehensions are an application of monads. But Haskellers think that monads are too valuable a technique to neglect for a corner case like this.
-
So the technical reason is that it either violates the self-consistency of monads, or abandons them if favor of adopting list comprehension, essentially syntactic sugar, as an all-new feature with no precedent. Well done, my friend. Not a technical explanation of why it can't be done, but an explanation of why it won't.TheEnvironmentalist– TheEnvironmentalist2015年09月14日 20:23:49 +00:00Commented Sep 14, 2015 at 20:23
Explore related questions
See similar questions with these tags.