7
\$\begingroup\$

I try to find the center of all the red pixels contained in an image using Haskell and repa.

My problem is, that the code is not fast enough, probably because I read the image with repa-io which returns an Array Z DIM3 Word8, but I was not able to figure out how to filter the repa array by how red the given pixels are.

That's why I convert the array to a normal list, using R.toList. The rest of the calculations are then done using lists.

Here is my code:

import Data.Array.Repa.Repr.ForeignPtr (F, fromForeignPtr, toForeignPtr)
import Data.Array.Repa (Z(..), (:.)(..), DIM0, DIM1, DIM2, DIM3, Array(..),
 Source, Shape)
import qualified Data.Array.Repa as R
import Data.Array.Repa.IO.DevIL
import Data.List
import System.Environment (getArgs)
main :: IO ()
main = do
 [path] <- getArgs
 (RGBA arr) <- runIL $ readImage path
 putStrLn $ "red center " ++ (show . findRedDot $ arr)
-- | find the red dot
findRedDot :: (Ord a, Num a, Source r a) => Array r DIM3 a -> (Int, Int)
findRedDot img = (median . fst $ splitted, median . snd $ splitted)
 where -- function returning the coordinates as a tuple, when the pixel
 -- is red enough, otherwise return (-1, -1)
 convert f (Z :. i :. j) = let r = f (Z :. i :. j :. 0)
 g = f (Z :. i :. j :. 1)
 b = f (Z :. i :. j :. 2) in
 if r > 237 && (g + b) < 50
 then (i, j)
 else (-1, -1)
 -- turn the 3rd dimension into tuples containing the coordinates
 -- (see convert function)
 packed = R.traverse img
 (\(Z :. w :. h :. _) -> Z :. w :. h)
 convert
 -- turns the DIM2 repa array into a default haskell list
 lst = R.toList packed
 p (-1, -1) = False
 p _ = True
 -- split up the list into x and y component
 splitted = splitUp $ filter p lst
-- | split up a list of tuples into two lists
splitUp :: [(Int, Int)] -> ([Int], [Int])
splitUp [] = ([], [])
splitUp ((x, y):rest) = (x:(fst next), y:(snd next))
 where next = splitUp rest
-- | find the median of a numeric list
median :: (Num a, Ord a) => [a] -> a
median [] = -1
median l = sorted !! mid
 where len = length l
 mid = len `quot` 2
 sorted = sort l

I did compile with ghc -O2 -threaded -prof -fprof-auto Main.hs.

Running this with ./Main.hs example.png +RTS -p gives me the correct result after some time, when using a 1920*1080 image which is completely red (This is the worst case, because finding the median is very time consuming).

Content of Main.prof

Mon May 8 21:26 2017 Time and Allocation Profiling Report (Final)
 test5 +RTS -p -RTS fully-red-extreme.png
 total time = 0.81 secs (809 ticks @ 1000 us, 1 processor)
 total alloc = 2,248,665,408 bytes (excludes profiling overheads)
COST CENTRE MODULE %time %alloc
median.sorted Main 48.2 44.7
splitUp Main 10.5 13.3
findRedDot.convert Main 10.0 2.2
findRedDot.lst Main 7.0 19.2
splitUp.next Main 6.4 2.2
findRedDot.splitted Main 3.5 5.2
findRedDot.convert.r Main 3.2 4.4
main Main 2.8 0.0
findRedDot.convert.b Main 2.6 4.4
findRedDot.convert.g Main 1.4 4.4
median.len Main 1.4 0.0
 individual inherited
COST CENTRE MODULE no. entries %time %alloc %time %alloc
MAIN MAIN 58 0 0.1 0.0 100.0 100.0
 main Main 117 0 2.8 0.0 99.9 100.0
 findRedDot Main 118 1 0.0 0.0 97.0 100.0
 findRedDot.splitted Main 123 1 3.5 5.2 21.3 20.7
 splitUp Main 130 2073601 10.5 13.3 16.9 15.5
 splitUp.next Main 133 2073600 6.4 2.2 6.4 2.2
 findRedDot.p Main 124 2073600 0.9 0.0 0.9 0.0
 findRedDot.lst Main 122 1 7.0 19.2 25.2 34.7
 findRedDot.packed Main 125 0 1.0 0.0 18.2 15.5
 findRedDot.convert Main 126 2073600 10.0 2.2 17.2 15.5
 findRedDot.convert.b Main 129 2073600 2.6 4.4 2.6 4.4
 findRedDot.convert.g Main 128 2073600 1.4 4.4 1.4 4.4
 findRedDot.convert.r Main 127 2073600 3.2 4.4 3.2 4.4
 findRedDot.packed Main 120 1 0.0 0.0 0.0 0.0
 findRedDot.packed.\ Main 121 1 0.0 0.0 0.0 0.0
 median Main 119 2 1.0 0.0 50.6 44.7
 median.sorted Main 134 2 48.2 44.7 48.2 44.7
 median.len Main 132 2 1.4 0.0 1.4 0.0
 median.mid Main 131 2 0.0 0.0 0.0 0.0
 CAF Main 115 0 0.0 0.0 0.0 0.0
 main Main 116 1 0.0 0.0 0.0 0.0
 CAF GHC.IO.Encoding 102 0 0.0 0.0 0.0 0.0
 CAF GHC.IO.Handle.FD 100 0 0.0 0.0 0.0 0.0
 CAF GHC.Event.Thread 98 0 0.0 0.0 0.0 0.0
 CAF GHC.IO.Encoding.Iconv 94 0 0.0 0.0 0.0 0.0
 CAF GHC.Conc.Signal 84 0 0.0 0.0 0.0 0.0
 CAF GHC.IO.Handle.Text 79 0 0.0 0.0 0.0 0.0

The execution time is definitely longer than the one given in the .prof file and I don't know whether the sorting of the list can be accelerated in some way.

The most important goal is to spare the conversion from a repa array into a list and to do the whole computation with repa arrays.

200_success
145k22 gold badges190 silver badges478 bronze badges
asked May 8, 2017 at 19:51
\$\endgroup\$
4
  • \$\begingroup\$ Looking at the docs, Data.Repa.Array.Auto, Data.Repa.Array.Generic and Data.Repa.Array.Material each define filter. \$\endgroup\$ Commented May 9, 2017 at 6:01
  • \$\begingroup\$ Small remark: That's the alpha of version 4, @Gurkenglas. Erich uses version 3. repa-array is not an official variant (yet). \$\endgroup\$ Commented May 9, 2017 at 6:30
  • \$\begingroup\$ I would prefer to use the official version, but I will definitely have a look at this. \$\endgroup\$ Commented May 9, 2017 at 7:27
  • \$\begingroup\$ Ok, I looked at the docs, but I think the Array type has completely changed. Could you give me a hint where to look for instructions? Or could you give me a short example? \$\endgroup\$ Commented May 9, 2017 at 15:23

1 Answer 1

2
\$\begingroup\$

Small things I noticed:

  • splitUp is already defined in the prelude as unzip.
  • median working on a list makes it slow and your implementation is not semantically correct.
    Generally the median is defined as the "center" element for datasets of odd magnitude, or the arithmetic mean of the two center elements for datasets of even magnitude.

    If you made it use an array, you'd also get the vast benefit of not needing the len traverse the full list, nor to traverse it when accessing a specific index, which should cut execution time for that function roughly in half.

    According to the profiling, that will also vastly reduce overall execution time.

  • The argument to convert f in findRedDot is repeated unnecessarily a lot, it should be possible to do the following:

    convert f a@(Z:. i :. j) = let r = f (a :. 0)
     g = f (a :. 1)
     b = f (a :. 2) in
     if r > 237 && (g+b) < 50
     then (i,j) 
     else (-1, -1)
    

    This makes it a tad easier to understand what exactly you do here, IMO

  • You speak of not being able to figure out a way to filter the repa array. A bit of hoogling reveals selectP, which should help with that. They explicitly mention:

    Produce an array by applying a predicate to a range of integers. If the predicate matches, then use the second function to generate the element.

    • This is a low-level function helpful for writing filtering operations on arrays.
    • Use the integer as the index into the array you're filtering.
answered Oct 10, 2017 at 8:25
\$\endgroup\$
2
  • \$\begingroup\$ selectP does only work for integers, but I have to filter the array by looking at the last dimension in a whole. What is your suggestion concerning the median problem, I don't really see, what you are trying to say. \$\endgroup\$ Commented Oct 10, 2017 at 16:31
  • \$\begingroup\$ @Erich your median implementation iterates the list thrice. If you had an array, that'd more likely be around 1.7 times for sorting and then a constant-time access to the median element(s). Re: selectP: Read the quoted section again. This is what they use for filtering arrays. Reread if you don't understand it yet... \$\endgroup\$ Commented Oct 10, 2017 at 16:43

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.