I implemented the following function using the matrix object from the Breeze library. Basically it is just a glorified while loop.
The convolution in Breeze is extremely slow, so I rolled my own, optimized for a kernel that is just a vector. This is already much faster, but I figured there must be something else I can do to make this go faster.
The most expensive operation, according to the profiler, is the instantiation of the `ArrayDeque` (pointed to by the arrow), which I don't really need: all I wanted was a circular buffer, but I could not find one in the library.
The second thing is calling `until` on the `Int`; I don't think that can be avoided.
Boxing the doubles is also taking quite some time; maybe there is a way to specialize?
Finally, the majority of the time is taken by the function call itself (circled in red), and I don't know what that means. Any insight?
import scala.collection.mutable
import breeze.linalg.{DenseMatrix, DenseVector}

def conv(m: DenseMatrix[Int], k: DenseVector[Double]): DenseMatrix[Double] = {
  val kData = k.data
  val dataIter = m.data.iterator // column-major traversal of the matrix
  val height = m.rows
  val convoluted = Array.newBuilder[Double]
  // Sliding window holding the previous k.length - 1 values.
  val prev = mutable.ArrayDeque.empty[Double]
  for (_ <- 1 until k.length) {
    prev.addOne(dataIter.next())
  }
  var count = k.length - 1
  while (dataIter.hasNext) {
    val cur = dataIter.next()
    val slice = prev.append(cur)
    // Only emit a result when the window does not straddle two columns.
    if (count % height >= k.length - 1) {
      var r = 0D
      for (i <- 0 until k.length) {
        r += kData(i) * slice(i)
      }
      convoluted.addOne(r)
    }
    prev.removeHead()
    count += 1
  }
  DenseMatrix.create(m.rows - (k.length - 1), m.cols, convoluted.result())
}
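On the boxing point: `Iterator[Int]` and `mutable.ArrayDeque[Double]` are not specialized for primitives, so every element passes through a box. A minimal Breeze-free sketch of the same per-column sliding window (plain arrays standing in for `m.data` and `k.data`, hypothetical name `convNoBox`) that indexes the column-major array directly and stays entirely in primitives:

```scala
// Sketch: mData is the column-major matrix data, kData the kernel taps.
// Direct while-loop indexing keeps everything in primitive Int/Double.
def convNoBox(mData: Array[Int], rows: Int, cols: Int,
              kData: Array[Double]): Array[Double] = {
  val kLen = kData.length
  val outRows = rows - (kLen - 1)          // valid windows per column
  val out = new Array[Double](outRows * cols)
  var c = 0
  while (c < cols) {
    var r = 0
    while (r < outRows) {
      var acc = 0.0
      var i = 0
      while (i < kLen) {
        acc += kData(i) * mData(c * rows + r + i)
        i += 1
      }
      out(c * outRows + r) = acc
      r += 1
    }
    c += 1
  }
  out
}
```

For a 3×2 matrix stored column-major this produces the same values, in the same order, as the builder in `conv` above.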
Following is the annotated flame chart; please note that `fut` is the function `conv` above. Everything else is unchanged.
1 Answer
all I wanted was a circular buffer
Consider using an array accessed with mod. That is, set `s = k.length - 1`, allocate a "previous" array `p`, and then access `p(i % s)`.
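A minimal sketch of that idea (1-D for brevity, hypothetical name `convRing`; the per-column bookkeeping from the question is omitted):

```scala
// Ring-buffer sketch: p holds the previous s = k.length - 1 samples and is
// overwritten in place via `index % s`, so nothing is allocated per step.
def convRing(data: Array[Double], kData: Array[Double]): Array[Double] = {
  val s = kData.length - 1
  val p = new Array[Double](s)               // the "previous" ring buffer
  var i = 0
  while (i < s) { p(i) = data(i); i += 1 }   // preload the first s samples
  val out = new Array[Double](data.length - s)
  var n = s                                  // index of the current sample
  while (n < data.length) {
    var r = kData(s) * data(n)               // newest sample, last kernel tap
    var j = 0
    while (j < s) {
      // data(n - s + j) was stored at (n - s + j) % s, equal to (n + j) % s
      r += kData(j) * p((n + j) % s)
      j += 1
    }
    out(n - s) = r
    if (s > 0) p(n % s) = data(n)            // current sample becomes "previous"
    n += 1
  }
  out
}
```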
If you're lucky, the compiler may notice that a portion of the code is just copying, and it will issue memcpy-style instructions.
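To make that copying explicit rather than hoping the compiler spots it, a variant keeps the window contiguous and shifts it with `System.arraycopy`, which HotSpot intrinsifies into a bulk copy (sketch, hypothetical name `dotShift`):

```scala
// Alternative sketch: shift a contiguous window by one each step with
// System.arraycopy instead of using a ring buffer with mod.
def dotShift(data: Array[Double], kData: Array[Double]): Array[Double] = {
  val kLen = kData.length
  val w = new Array[Double](kLen)
  System.arraycopy(data, 0, w, 0, kLen)      // initial window
  val out = new Array[Double](data.length - kLen + 1)
  var n = 0
  while (n < out.length) {
    var r = 0.0
    var j = 0
    while (j < kLen) { r += kData(j) * w(j); j += 1 }
    out(n) = r
    if (n + kLen < data.length) {
      System.arraycopy(w, 1, w, 0, kLen - 1) // drop the oldest sample
      w(kLen - 1) = data(n + kLen)           // append the next one
    }
    n += 1
  }
  out
}
```

Whether this beats the mod version depends on the kernel length: the copy is O(k) per step, but it avoids the integer division hidden in `%` on every access.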