5
\$\begingroup\$

I would like to use Haskell to do some image processing so I have been writing small programs to test performance. I wrote a program to batch convert PNG images from color to greyscale. I compared the performance of the program below with a Python script using PIL. I found when compiled with -threaded -O2, the Haskell code takes four times longer than the python script (this comparison was done on a folder of 50 images).

Is there a way to speed this process up?

Haskell Code

import Codec.Picture
import System.Environment
import System.Directory
import System.FilePath
import Control.Monad
dynToImg :: DynamicImage -> Image PixelRGB8
dynToImg (ImageRGB8 img) = img
procAvg :: FilePath -> FilePath -> IO ()
procAvg inpath outpath = do
 Right inImage <- readPng inpath
 let image = avgImage . dynToImg $ inImage
 writePng outpath image
main :: IO ()
main = do
 args <- getArgs
 let [infolder,outfolder,tstr] = args
 t = read tstr :: Int
 files <- getDirectoryContents infolder
 let baseNames = filter ((==) ".png" . takeExtension) files
 infiles = map (infolder </>) baseNames
 outfiles = map (outfolder </>) baseNames
 zipWithM_ procAvg infiles outfiles
avgImage :: Image PixelRGB8 -> Image Pixel8
avgImage = pixelMap pixelAvg
pixelAvg :: PixelRGB8 -> Pixel8
pixelAvg (PixelRGB8 r g b) =
 (r `div` 3 + g `div` 3 + b `div` 3 +
 (r `mod` 3 + g `mod` 3 + b `mod` 3) `div` 3)

Python Code

from PIL import Image
import sys
import os
if __name__ == "__main__":
 infolder = sys.argv[1]
 outfolder = sys.argv[2]
 fnames = os.listdir(infolder)
 for f in fnames:
 infile = os.path.join(infolder,f)
 outfile = os.path.join(outfolder,f)
 inimg = Image.open(infile)
 gray = inimg.convert('L')
 gray.save(outfile)
 inimg.close()
asked May 14, 2016 at 6:48
\$\endgroup\$
0

1 Answer 1

4
\$\begingroup\$

Python

What does your Python code look like? I'm sure that most of the work is actually being done in C, not Python. JuicyPixels allows you to write an arbitrary pixel mapping function in Haskell. If the Python library called a user-supplied Python function to convert every pixel I can assure you that it wouldn't be as fast as it is.

Other Libraries

That said, JuicyPixels may not be the fastest Haskell library to use. For instance, Haskell can call routines in C libraries, and there are Haskell bindings to the ImageMagick library via the imagemagick package.

Another package you should have a look at is friday. On assorted algorithms it compares favorable with ImageMagick, It cannot produce SIMD instructions, so it's not as fast as libraries such as OpenCV. On the other hand, it gives you the tools to construct your own image transformations so it's a lot more versitile than a library which gives you just a handful of canned algorithms.

Improving pixelAvg

Obviously the function pixelAvg is the hot spot in the program, so anything we can do to optimize it will have a significant impact on the execution time. Here are some versions I tried:

pixelAvg (PixelRGB8 r g b) = 
-- (1) original version:
 (r `div` 3 + g `div` 3 + b `div` 3) + (r `mod` 3 + g `mod` 3 + b `mod` 3) `div` 3
-- (2) use quot and rem:
 (quot r 3 + quot g 3 + quot b 3) + (quot (rem r 3 + rem g 3 + rem b 3) 3)
-- (3) use quotRem:
 let (rq,rr) = quotRem r 3
 (gq,gr) = quotRem g 3
 (bq,br) = quotRem b 3
 in (rq+gq+bq + (rr+gr+br) `quot` 3)
-- (4) simplify the expression
 (div r 3 + div g 3 + div b 3)
-- (5) just use one division
 fromIntegral $ ((fromIntegral r + fromIntegral g + fromIntegral g) :: Int) `quot` 3
-- (6) use a lookup table
 V.unsafeIndex avgVector (fromIntegral r + fromIntegral g + fromIntegral b)

Here are some timings I obtained on a 4246 x 2856 image:

(1) 0m2.796s
(2) 0m2.768s
(3) 0m2.440s
(4) 0m2.297s
(5) 0m2.198s
(6) 0m2.117s

quot/rem avoids a sign check and therefore should be marginally faster than div/mod while giving the same answers for positive arguments. However an even bigger speed-up can be obtained by computing both the quotient and remainder with one function call (either quotRem or divMod) as demonstrated by (3).

Note that the formula used by (4) only differs from the original formula by at most 2 and has a lot fewer operations. Formula (5) reduces the number of divisions to just one by converting all three pixel values to Ints, adding and then dividing the total by 3.

As to be expected, the lookup table method is is the overall fastest. Here's the rest of the code that initializes the table:

import qualified Data.Vector.Unboxed as V
...
avgVector :: V.Vector Pixel8
avgVector = V.generate (3*256) (\x -> fromIntegral $ div x 3)

So there are significant gains to be made by optimizing pixelAvg, but there are also limitations to what you can achieve. To get better results you'll likely have to use a different library.

answered May 14, 2016 at 13:19
\$\endgroup\$

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.