Using Unboxed Vectors
I've found another important optimization - you want to use unboxed vectors instead of the regular vectors. Here is an alternate version of genVec:
import Control.Monad (forM_)
import Control.Monad.ST (runST)
import qualified Data.Vector.Unboxed as UnboxedV
import qualified Data.Vector.Unboxed.Mutable as UnboxedM

-- imgSize and computeCIndex come from the original program
genVec :: [Complex] -> UnboxedV.Vector Int
genVec xs = runST $ do
  -- one mutable counter per pixel, all starting at zero
  mv <- UnboxedM.replicate (imgSize*imgSize) (0::Int)
  forM_ xs $ \c -> do
    let x = computeCIndex c
    count <- UnboxedM.unsafeRead mv x
    UnboxedM.unsafeWrite mv x (count+1)
  UnboxedV.freeze mv
This will cut down the run time by another couple of seconds (which is now significant since the whole pipeline now only takes about 4 secs to run.)
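If you want to see the pattern in isolation, here is a self-contained toy version you can compile and run. The histogram name, the index list, and the use of modify/unsafeFreeze are my own illustration, not part of the original program (unsafeFreeze skips the copy that freeze makes, and is safe here because the mutable vector is never touched after the loop):

import Control.Monad (forM_)
import Control.Monad.ST (runST)
import qualified Data.Vector.Unboxed as UnboxedV
import qualified Data.Vector.Unboxed.Mutable as UnboxedM

-- Toy stand-in for genVec: count occurrences of bucket indices.
-- The real code first maps each Complex point to an index.
histogram :: Int -> [Int] -> UnboxedV.Vector Int
histogram size ixs = runST $ do
  mv <- UnboxedM.replicate size (0 :: Int)
  forM_ ixs $ \i -> UnboxedM.modify mv (+1) i
  -- safe: mv is not reused after this, so no copy is needed
  UnboxedV.unsafeFreeze mv

main :: IO ()
main = print (histogram 4 [0,1,1,3,3,3])  -- prints [1,2,0,3]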
Update:
I think you'll find that improving the Double parsing is about the best you can do for a single-threaded program. To scale to 600M points you are going to have to use multiple threads / machines. Fortunately this is a classic map-reduce problem, so there are a lot of tools and libraries (not necessarily in Haskell) that you can draw upon.
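To make the Double-parsing point concrete, here is one hedged sketch of what a faster parser could look like, using attoparsec over strict ByteString instead of read on String. The points.txt file name and the "re im" line format are assumptions for illustration; any ByteString-based Double parser would do:

import qualified Data.ByteString.Char8 as BS
import Data.Attoparsec.ByteString.Char8 (parseOnly, double, char)

-- Parse one line like "0.25 -1.5" into a pair of Doubles.
-- parseOnly over a strict ByteString avoids the slow
-- String-based `read :: String -> Double`.
parsePoint :: BS.ByteString -> Either String (Double, Double)
parsePoint = parseOnly $ do
  re <- double
  _  <- char ' '
  im <- double
  return (re, im)

main :: IO ()
main = do
  contents <- BS.readFile "points.txt"
  print (length [p | Right p <- map parsePoint (BS.lines contents)])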
If you just want to scale on a single machine using multiple threads, you can put together your own solution using a module like Control.Monad.Par. For a clustered solution you'll probably have to use a third-party framework like Hadoop, in which case you might be interested in the hadron package - there is also a video describing it here: https://vimeo.com/90189610
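As a sketch of the single-machine route (assuming the genVec from above and a non-empty point list; chunksOf comes from the split package, and the chunk size is just a tuning knob picked for illustration), the map step histograms each chunk in parallel and the reduce step sums the per-chunk counts:

import Control.Monad.Par (runPar, parMap)
import qualified Data.Vector.Unboxed as UnboxedV
import Data.List.Split (chunksOf)  -- from the `split` package

-- Map: build a partial histogram for each chunk in parallel.
-- Reduce: sum the partial histograms element-wise.
-- Assumes xs is non-empty (foldr1) and genVec is defined above.
genVecPar :: Int -> [Complex] -> UnboxedV.Vector Int
genVecPar chunkSize xs =
  foldr1 (UnboxedV.zipWith (+))
         (runPar (parMap genVec (chunksOf chunkSize xs)))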