I recently started learning Haskell and as my first project I decided to port a particle simulation I had written in C.
The simulation is pretty simple. We have a cubic box with particles (spheres) inside it. Then we choose to either move a particle or to change the volume of the box and check for any collisions between the particles. If there are any collisions we keep the old configuration.
To speed up collision detection I have also implemented a 'cell linked list' which is in principle a uniform grid with a minimum size of a cell equal to the particle diameter. In the C code I have implemented this using arrays while in the Haskell code I use a Vector of Lists.
You can find my full code here, but here are some of the most important snippets,
The main Simulation Loop, where type Eval a = ReaderT (Env a) (StateT (SimState a) IO)
runSimulation :: Int -> Eval Double ()
runSimulation steps = do
Env npart ps <- ask
forM_ [1..steps] (\i -> do
forM_ [1..npart+1] (\_ -> do
SimState config _ variables <- get
selection <- liftIO $ getRandomR (0, npart)
if selection < npart
then do
rdx <- liftIO $ getRandomRs (0.0, _dx variables) >>= \r -> return $ take 3 r
rn <- liftIO $ getRandomR (0, npart - 1)
lift $ moveParticle (vec3fromList rdx) rn
else do
let dv = _dv variables
rdv <- liftIO $ getRandomR (-dv, dv)
racc <- liftIO $ getRandomR (0.0, 1.0)
let vol = boxVolume $ _box config
acc = exp $ (-ps) * rdv + fromIntegral npart * log ((vol + rdv) / vol)
when (racc < acc) $ changeVolume rdv
)
when (i `mod` 100 == 0) $ do
state@(SimState _ _ variables) <- get
let accVol = fromIntegral (_nVol variables) / 100.0
accMov = fromIntegral (_nMov variables) / (fromIntegral npart * 100)
olddx = _dx variables
olddv = _dv variables
newdx = (*) olddx (if accMov > 0.3 && olddx < 0.45 then 1.04 else 0.94)
newdv = (*) olddv (if accVol > 0.15 then 1.04 else 0.94)
liftIO $ print i
liftIO $ printf "dx = %.6f, dv = %.6f, nMov = %d, nVol = %d\n" newdx newdv (_nMov variables) (_nVol variables)
put $! state{_vars = variables{_dx = newdx, _dv = newdv, _nMov = 0, _nVol = 0}}
printConfig $ "Data/test" ++ show i ++ ".dat"
Here I'm effectively randomly choosing to either displace a particle or change the volume and every 100 steps I print useful information.
These two moves moveParticle
, changeVolume
are listed below,
moveParticle :: (RealFrac a, Floating a) => Vec3 a -> Int -> StateIO a ()
moveParticle dx nPart = do
SimState config@(Configuration box particles) cll variables <- get
let particle = particles V.! nPart
particle' = Particle $ applyBC $ _position particle .+. dx
cllIdx = cellIndex (_nCells cll) (_position particle')
isCollision = any (checkCollision box) [(particle', particles V.! pId) | pId <- neighbours cll cllIdx, pId /= nPart]
unless isCollision $ do --Accept move if there is no collision
let cllIdxOld = cellIndex (_nCells cll) (_position particle)
cll' = if cllIdx == cllIdxOld then cll else cllInsertAt (cllRemoveAt cll cllIdxOld nPart) cllIdx nPart
particles' = modifyVectorElement nPart particle' particles
put $! SimState config{_particles = particles'} cll' variables{_nMov = _nMov variables + 1}
changeVolume :: (Floating a, RealFrac a) => a -> Eval a ()
changeVolume dv = do
SimState config@(Configuration box particles) cll variables <- get
Env npart _ <- ask
let v = boxVolume box
box' = let scalef = ((v + dv) / v)**(1.0 / 3.0) in scaleBox scalef box
isCollision = any (checkCollision box') combinations
{-- A list of all the particles pairs that need to be checked for collision.
For each particle, the particles in the neighbouring cells are used. --}
combinations = do
pId <- [0..(npart - 1)]
let particle = particles V.! pId
cllIdx = cellIndex (_nCells cll) (_position particle)
pId' <- neighbours cll cllIdx
guard (pId /= pId')
return (particle, particles V.! pId')
unless isCollision $ do --Accept move if there is no collision
let new_cell_size = zipWith ((/) . fromIntegral) (_nCells cll) box'
new_n = map truncate box'
config' = config{_box = box'}
recreate = any (< 1.0) new_cell_size || any (> 2.0) new_cell_size || any (> 2) (zipWith ((abs .) . (-)) new_n (_nCells cll))
cll' = if recreate then generateCellList config' else cll
put $! SimState config' cll' variables{_nVol = _nVol variables + 1}
I would really appreciate a code review, especially since I just started learning Haskell. The code right now feels a bit messy. As you may have also noticed, I have given the accessor functions an underscore in their name with the intention to use Lenses later on.
I'm especially interested in any performance improvements. My Haskell implementation is up to 10 times slower than the C code which is unacceptable for a simulation. I should also mention that I've tried converting the Cell type to a Data.Seq and the Box type to a Vector but didn't see any performance improvements.
You can find the profiling report here and below I have also posted the space profiling graphs. I don't quite understand from these what is making the Haskell implementation so slow.
enter image description here
enter image description here
enter image description here
1 Answer 1
Heap profiling will not tell you much in this case - memory leaking is definitely not your problem. Using cost-centre profiling would probably be better suited for identification of the hot-spots.
After looking at the profile a bit (using home-grown tools), the worst offenders seem to be:
Lots of calls to
stg_newArray
, undoubtedly due to theCellList
vector getting copied bymodify
. The documentation says thatmodify
can update destructively, but I am pretty sure that this only refers to situations where the sourceVector
can be derived to be a temporary value available for fusion.To really get destructive updates, you would probably have to explicitly use a
MVector
and theST
monad. As you already have a monad around, the amount of refactoring involved wouldn't be crushing, but you would obviously lose a bit of purity.Additionally, the
neighbours
,getCell
andcellNeighbours
functions are getting hammered by your program. You are doing some pretty fancy list processing in there that GHC seems to not optimize very well - pretty much all these lists are actually generated and consumed. Helping GHC a bit by spelling out the loops might improve things.Finally, the default random number generator is slow - any time you use it you get a flurry of
Integer
allocations behind the scenes. Try to find a better one on Hackage.
Actually, I might have mis-interpreted the profile for point number two a bit - the following is actually a big no-no:
type CellIndex = [Int] -- The index of a cell [i, j, k]
Haskell can do a lot of optimization, but only if it has enough information. As it stands, getCell
will get a prefix like:
case y5 of wild18 {
[] -> lvl9;
: i1 ds8 -> case ds8 of wild19 {
[] -> lvl9;
: j1 ds9 -> case ds9 of wild20 {
[] -> lvl9;
: k1 ds10 -> case ds10 of wild21 {
[] -> case i1 of wild22 { GHC.Types.I# y6 ->
case j1 of wild23 { GHC.Types.I# y7 ->
case k1 of wild24 { GHC.Types.I# y8 -> ... } } };
: ipv1 ipv2 -> lvl9
} } } }
Meaning that it traverse the list, checks that it contains exactly three elements, and then unpacks every Int
separately. By passing (Int, Int, Int)
or your own Vec3
you would give Haskell enough information to derive that it can skip all the heap allocation.
Concerning my original point:
cellNeighbours (CellList !nCells _) cell = do
i <- [-1..1]
j <- [-1..1]
k <- [-1..1]
let temp = zipWith3 (((+).).(+)) nCells cell [i, j, k]
return $! zipWith mod temp nCells
Now this is arguably nice code - even though I personally don't like overly tricky (.)
constructions or usage of the list monad in the first place (list comprehensions!). The performance problem here is that this becomes something like follows after desugaring:
concatMap (\i -> concatMap (\j -> concatMap (\k -> [...]) [-1..1]) [-1..1]) [-1..1])
GHC will not unroll these, which means that you have constructed a multiple-level loop doing trivial list concatenations in a completely predictable manner.
If you for example wrote it as:
cellNeighbours (CellList (d,e,f) _) (x,y,z) = map add offsets
where add (a,b,c) = (x+a+d, y+b+e, z+c+f)
offsets = [(a,b,c) | a <- [-1..1], b <- [-1..1], c <- [-1..1]]
That would make offsets
a global constant that only gets evaluated once and would allow GHC to fuse the map
with the concatMap
in neighbours
, eliminating the intermediate list completely.
Some points on the structure:
You should try to isolate your simulation code as much as possible. Right now it is in "main.hs", which should normally just handle initialization and configuration. Splitting out the actual maths bits would be a pretty straightforward improvement.
On a similar note, you already have a monad, but aren't using it as an encapsulation tool - your core simulation code still uses
put
directly. Try to find out what "verbs" you are really using, then define some simple wrappers for those (sayincreaseParticleMoves
?). After you have eliminated all direct access, you can then export the monad and all actions to its own module, newtype-wrap it and stop exporting its definition. Having this sort of infrastructure would for example make the switch toST
much easier.
-
\$\begingroup\$ Hi Peter, thank you very much for taking the time to analyze my code and comment on it. \$\endgroup\$Grieverheart– Grieverheart2013年05月13日 17:54:57 +00:00Commented May 13, 2013 at 17:54
-
\$\begingroup\$ On point 2, I'm not sure what you're trying to say, could you be a bit more specific? I also depended on these functions being lazily evaluated, do you know if this is the case? On point 3, I did indeed notice
Integer
being used and later found out it is due toRandom
probably calling atime
function. Maybe I'm asking too much, but could you perhaps give me any tips beyond performance e.g. style and structure of the code? \$\endgroup\$Grieverheart– Grieverheart2013年05月13日 18:01:15 +00:00Commented May 13, 2013 at 18:01 -
\$\begingroup\$ I have added some more comments. And no, the problem with
Random
isn'ttime
, it's that it internally always first generates a largeInteger
random number and then cuts it down to what was actually requested. \$\endgroup\$Peter Wortmann– Peter Wortmann2013年05月14日 13:52:41 +00:00Commented May 14, 2013 at 13:52
Explore related questions
See similar questions with these tags.