Number processing slower with asm.js

Question 1

A while ago I wrote what I call a plasma effect in plain JavaScript:


The idea is to use two main oscillators (LFOs), each of which has its frequency and phase driven by two further oscillators. The outputs of the two main oscillators then determine the r, g, b values of each point.
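As a rough illustration of that idea, here is a minimal plain-JavaScript sketch. It is not the code from the fiddle: the names `lfo`, `mainOscillator` and `toColor`, the time parameter `t`, and the particular colour mapping are all assumptions made for the example.

```javascript
// Plain (non-asm.js) sketch; every name here is hypothetical.
// An LFO is described by base / amplitude / frequency / phase.
function lfo(p, t) {
    return p.base + p.amp * Math.sin(p.freq * t + p.phase);
}

// A main oscillator whose frequency and phase are each driven by another LFO.
// `t` is an assumed time value, `x` the coordinate being shaded.
function mainOscillator(main, freqLfo, phaseLfo, t, x) {
    var freq = lfo(freqLfo, t);
    var phase = lfo(phaseLfo, t);
    return main.base + main.amp * Math.sin(freq * x + phase);
}

// Assumed colour mapping: two outputs in [-1, 1] become three 0..255 channels.
function toColor(a, b) {
    function channel(v) { return Math.round(127.5 * (v + 1)); }
    return [channel(a), channel(b), channel((a + b) / 2)];
}
```

With two main oscillators, each driven by two more, this accounts for the six LFOs mentioned later in the post.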

To test the fairly new asm.js, I ported it to be fully 'typed' and to run in complete memory isolation.

The big issue I encountered is that it is much slower with asm.js than with plain JavaScript. Worse: when I comment out 'use asm' so that the browser runs the code as ordinary JavaScript, I get better results (30% faster) than when the asm.js compilation succeeds.

The memory I use is:

A byte array the size of the canvas: this is the output that I pass to putImageData to draw on the canvas.
Right after it, in the same buffer, a float array that stores the LFO parameters (base / amplitude / frequency / phase) for 6 LFOs.
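A minimal sketch of such a layout, assuming a 256×256 canvas and a 64-bit float heap. The dimensions and the names `nextValidHeapSize`, `lfoBase` and `setLfo` are made up for the example, and the heap-size rule is only an approximation of what asm.js validators have required.

```javascript
// Hypothetical sizes; the original post does not give the canvas dimensions.
var WIDTH = 256, HEIGHT = 256;
var PIXEL_BYTES = WIDTH * HEIGHT * 4;          // RGBA output, one byte per channel
var LFO_COUNT = 6, FIELDS_PER_LFO = 4;         // base, amplitude, frequency, phase
var LFO_BYTES = LFO_COUNT * FIELDS_PER_LFO * 8; // 8 bytes per float64

// Assumption: round the heap up to a power of two of at least 64 KiB.
function nextValidHeapSize(bytes) {
    var size = 0x10000;
    while (size < bytes) size *= 2;
    return size;
}

var heap = new ArrayBuffer(nextValidHeapSize(PIXEL_BYTES + LFO_BYTES));
var heapU8 = new Uint8Array(heap);     // both views cover the whole buffer,
var heapF64 = new Float64Array(heap);  // as asm.js heap views do

// The LFO block starts right after the pixel bytes. PIXEL_BYTES is a
// multiple of 8 here, so the float64 index is a plain shift.
var lfoBase = PIXEL_BYTES >> 3;

function setLfo(i, base, amp, freq, phase) {
    var p = lfoBase + i * FIELDS_PER_LFO;
    heapF64[p] = base;
    heapF64[p + 1] = amp;
    heapF64[p + 2] = freq;
    heapF64[p + 3] = phase;
    return p; // float64 index, usable as the first argument of oscillatorValueAt
}

// View over the pixel region only, suitable for wrapping in an ImageData.
var pixels = new Uint8ClampedArray(heap, 0, PIXEL_BYTES);
```

The index returned by `setLfo` is a float64 index, which matches how `oscillatorValueAt` shifts its first argument by 3 before addressing the heap.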

I've made two versions, one fp32 and one fp64, neither of which is fast (<50 ms per frame, whereas the plain-object version takes around 20 ms).

Be sure to use a browser that understands asm.js; Firefox is one.

jsFiddle

After experimenting, I found that a lot of time is spent evaluating the LFO value:

    function oscillatorValueAt(heap_f64_Index, x) {
        heap_f64_Index = heap_f64_Index | 0;    // int parameter
        x = +x;                                 // double parameter
        var base = 0.0,
            amp = 0.0,
            freq = 0.0,
            phase = 0.0;
        // Convert the float64 index into a byte offset (8 bytes per element).
        heap_f64_Index = heap_f64_Index << 3;
        // Read the four consecutive LFO parameters from the heap.
        base = +heap_f64[(heap_f64_Index + 0) >> 3];
        amp = +heap_f64[(heap_f64_Index + 8) >> 3];
        freq = +heap_f64[(heap_f64_Index + 16) >> 3];
        phase = +heap_f64[(heap_f64_Index + 24) >> 3];
        return +(base + amp * sin(freq * x + phase));
    }

This is called several times per pixel in the tickPixels function, with calls like:

vaty = +oscillatorValueAt(osc1, yD);

Could you help me find out why there is no speed boost?

Edit: when the compilation to asm.js succeeds, you see this in the console:

Successfully compiled asm.js code (total compilation time 0ms; not stored in cache)

On the other hand, if the code does not follow the asm.js conventions, you get a message like:

TypeError: asm.js type error: must be of the form +x, fround(x), simdType(x) or x|0

If you see this message, the code is run as plain JavaScript.
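For reference, here is a minimal self-contained module that follows these conventions and wraps the oscillatorValueAt function shown above. The module name `OscModule`, the 64 KiB heap and the sample LFO values are assumptions for the example, not the code from the fiddle. In an engine without asm.js support the same code simply runs as ordinary JavaScript.

```javascript
// Sketch of an asm.js module; OscModule is a hypothetical name.
function OscModule(stdlib, foreign, heap) {
    "use asm";
    var sin = stdlib.Math.sin;
    var heap_f64 = new stdlib.Float64Array(heap);

    function oscillatorValueAt(heap_f64_Index, x) {
        heap_f64_Index = heap_f64_Index | 0;   // x|0 marks an int parameter
        x = +x;                                // +x marks a double parameter
        var base = 0.0, amp = 0.0, freq = 0.0, phase = 0.0;
        heap_f64_Index = heap_f64_Index << 3;
        base = +heap_f64[(heap_f64_Index + 0) >> 3];
        amp = +heap_f64[(heap_f64_Index + 8) >> 3];
        freq = +heap_f64[(heap_f64_Index + 16) >> 3];
        phase = +heap_f64[(heap_f64_Index + 24) >> 3];
        return +(base + amp * sin(freq * x + phase));
    }

    return { oscillatorValueAt: oscillatorValueAt };
}

// Assumed setup: one LFO stored at float64 index 0 of a 64 KiB heap.
var oscHeap = new ArrayBuffer(0x10000);
new Float64Array(oscHeap).set([0.5, 2.0, 1.0, 0.0], 0); // base, amp, freq, phase

var osc = OscModule({ Math: Math, Float64Array: Float64Array }, {}, oscHeap);
var sample = osc.oscillatorValueAt(0, Math.PI / 2);     // 0.5 + 2.0 * sin(pi/2)
```

Every parameter is coerced on entry and every return value carries an explicit coercion. These are the forms that the type error quoted above asks for.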

Question 2

IIRC there is a significant cost when passing values between asm.js and non-asm.js code. Let me find the source for that...

Question 3

I'd be glad to know. Yet unless I'm wrong, oscillatorValueAt is the main culprit and is used only within the scope of the module. I'd bet it is inlined when interpreted/JITed and not inlined when compiled, which would explain the performance gap.

Question 4

Interestingly, try removing "use asm" on the float64 example without touching anything else and see if the performance improves.

Question 5

Yes, that's what I say in my post: WITHOUT asm it is 30% faster... ???

Question 6

There could be a number of factors here. First, most of the computation happens in a tiny kernel. asm.js isn't necessary to make that fast; usually the normal JIT can do just as well. asm.js really shines on large applications with lots of hot code.

And asm.js might win or lose on such tiny kernels just because the register allocator might emit something a little different. Small kernels are very noisy that way.

But the issue here looks like something else. Check out this version, which modifies that code to add a little "counter" that adjusts the value sent into sin(). Most of the time spent in this benchmark is in sin(), as the profiler shows. And in Firefox, sin() behaves a little differently in asm.js mode than outside it: outside asm.js there is a cache, which helps speed up simple code like that in the SunSpider benchmark. All browsers implemented such caches because they sometimes really help, but so far Firefox doesn't have such a cache for asm.js. The counter I added makes the cache less effective, since a much wider range of values is sent into sin(), and with that change asm.js is significantly faster. Or, to put it another way: asm.js is about as fast as before, but non-asm.js is much slower, because the cache no longer helps.
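To make the cache effect concrete, here is a toy model. It is not how Firefox implements its math cache: `makeCachedSin` and `run` are hypothetical names, and the Map-based cache and the frame offset of 1000 are arbitrary choices for the example. It only counts how often a memoized sin() sees an argument it has already computed, with and without a per-frame counter added to the input.

```javascript
// Toy memoizing sin(); an illustration only, not the browser's cache.
function makeCachedSin() {
    var cache = new Map();
    var stats = { hits: 0, misses: 0 };
    function cachedSin(x) {
        if (cache.has(x)) {
            stats.hits++;
            return cache.get(x);
        }
        stats.misses++;
        var v = Math.sin(x);
        cache.set(x, v);
        return v;
    }
    return { sin: cachedSin, stats: stats };
}

// 100 frames of 100 evaluations each. With useCounter, every frame shifts
// the arguments, so no value is ever seen twice.
function run(useCounter) {
    var c = makeCachedSin();
    for (var frame = 0; frame < 100; frame++) {
        for (var i = 0; i < 100; i++) {
            var x = i * 0.1 + (useCounter ? frame * 1000 : 0);
            c.sin(x);
        }
    }
    return c.stats;
}

var repeated = run(false); // same 100 arguments every frame
var varied = run(true);    // arguments change every frame
```

In this model the repeated run is served from the cache on every frame after the first, while the varied run never hits the cache at all. That is the shape of the difference described above.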

What are the implications? If you need this code to be very fast, you might just implement your own sin() cache in asm.js. You can make it as low-precision as you want, so it could be even more efficient than the browser's general and precise cache.
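One possible shape for such a low-precision cache is a precomputed lookup table kept in the asm.js heap. This is a sketch under assumptions: the module name `SinTableModule`, the table size of 1024 entries and the nearest-entry lookup are choices made for the example, and the error bound of roughly 0.003 follows from the table step rather than from any measurement.

```javascript
// Sketch of a table-based sin() in asm.js style; names are hypothetical.
function SinTableModule(stdlib, foreign, heap) {
    "use asm";
    var sin = stdlib.Math.sin;
    var floor = stdlib.Math.floor;
    var f64 = new stdlib.Float64Array(heap);

    // Fill entries 0..1023 with sin(i * 2*pi / 1024).
    function init() {
        var i = 0;
        for (i = 0; (i | 0) < 1024; i = (i + 1) | 0) {
            f64[(i << 3) >> 3] = +sin(+(i | 0) * 0.006135923151542565);
        }
    }

    // Nearest-entry lookup; "& 1023" wraps the index, including for negative x.
    function fastSin(x) {
        x = +x;
        var i = 0;
        i = ~~floor(x * 162.97466172610083 + 0.5) & 1023; // 1024 / (2*pi)
        return +f64[(i << 3) >> 3];
    }

    return { init: init, fastSin: fastSin };
}

var tableHeap = new ArrayBuffer(0x10000); // table uses the first 8 KiB
var table = SinTableModule({ Math: Math, Float64Array: Float64Array }, {}, tableHeap);
table.init();
var approx = table.fastSin(1.0); // close to Math.sin(1.0)
```

A larger table or linear interpolation between neighbouring entries would trade memory or a few extra operations for better precision.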

Even better, you can implement sin() in asm.js as well, once more making it as low-precision as you can tolerate. Right now browsers call out to libc for sin(), which tends to be very precise, and a lower-precision asm.js one could beat that. It would also be more consistent between users, since there would be no libc differences. (We are considering shipping sin() with emscripten's libc, for these reasons, but haven't so far.)
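As one example of what a low-precision sin() written in asm.js might look like, the sketch below uses a well-known parabolic approximation with a single refinement step, after reducing the argument to [-pi, pi). This is a swapped-in technique chosen for illustration, not something taken from emscripten or from the original fiddle. The name `FastSinModule` is hypothetical, and the absolute error is on the order of 0.001.

```javascript
// Sketch of an approximate sin() in asm.js style; FastSinModule is a made-up name.
function FastSinModule(stdlib) {
    "use asm";
    var floor = stdlib.Math.floor;
    var abs = stdlib.Math.abs;

    function fastSin(x) {
        x = +x;
        var y = 0.0;
        // Range reduction to [-pi, pi): x -= 2*pi * floor((x + pi) / (2*pi))
        x = x - 6.283185307179586 * +floor((x + 3.141592653589793) * 0.15915494309189535);
        // Parabola through (-pi, 0), (0, 0), (pi, 0): (4/pi)*x - (4/pi^2)*x*|x|
        y = 1.2732395447351628 * x - 0.4052847345693511 * x * +abs(x);
        // One refinement step that pulls the parabola closer to the sine curve.
        y = 0.225 * (y * +abs(y) - y) + y;
        return +y;
    }

    return { fastSin: fastSin };
}

var fast = FastSinModule({ Math: Math });
var approxHalf = fast.fastSin(Math.PI / 6); // close to 0.5
```

Whether this beats the built-in sin() depends on the engine and the hardware, so it is worth profiling both variants before committing to one.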

Question 7

Thanks for your valuable comments. It seems asm.js is not mature yet. I will try to speed up the sin function with an approximation next week and update my post.

Alon Zakai · Answer 1 · 2015-09-04 20:52:42Z

