optimizations for small drawRect calls
Patch Sets 1 to 6. Total messages: 6
reed1
13 years, 8 months ago (2012-05-10 20:25:00 UTC) #1
Likely I will commit these piecemeal, but here is the sum of my experiments so far to speed up small drawRects (simulating dashing). Bench runs are very noisy :( but I'm seeing ~25% faster on the dash_4_rect benchmark. Will run some timings on linux (64bit) before I commit anything. Want to get some aggregate feeling of perf change on all benches (but how?)
~20% faster on linux (when I ran with -O3 for both runs)
The templates add a *lot* of code complexity to BlitRow_D32. Do they really yield that much more performance than a naive unrolling? The only thing I see you're winning is that the odd-man-out part of the loop (1..3) is unrolled.

For noisy timings, what -repeat count are you using? My rule of thumb is -repeat 50 for 5% noise, -repeat 150 for 1%.
Good question about the value of the 1-3 tail. I will try expanding the test to time w/ and w/o those being templated.
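For readers following the templates-versus-naive-unrolling question, here is a minimal sketch of the two shapes being compared. It is not the code from this patch or from BlitRow_D32: the function names (`fill_naive_unrolled`, `fill_fixed`, `fill_templated_tail`) are hypothetical, and a plain solid-color fill stands in for the real blend so the difference in how the 1..3 tail is handled is easy to see.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a 32-bit row blit: writes `color` to `count` pixels.
// Naive variant: unroll the main loop by 4, handle the 0..3 tail with a loop.
static void fill_naive_unrolled(uint32_t* dst, uint32_t color, size_t count) {
    while (count >= 4) {
        dst[0] = color;
        dst[1] = color;
        dst[2] = color;
        dst[3] = color;
        dst += 4;
        count -= 4;
    }
    // Tail: a runtime loop over the remaining 0..3 pixels.
    while (count > 0) {
        *dst++ = color;
        --count;
    }
}

// Templated variant: N is a compile-time constant, so the compiler is free
// to emit straight-line stores with no loop counter.
template <size_t N>
static inline void fill_fixed(uint32_t* dst, uint32_t color) {
    for (size_t i = 0; i < N; ++i) {
        dst[i] = color;
    }
}

static void fill_templated_tail(uint32_t* dst, uint32_t color, size_t count) {
    while (count >= 4) {
        fill_fixed<4>(dst, color);
        dst += 4;
        count -= 4;
    }
    // Tail: dispatch once to a fixed-size instantiation for 1, 2 or 3 pixels.
    switch (count) {
        case 3: fill_fixed<3>(dst, color); break;
        case 2: fill_fixed<2>(dst, color); break;
        case 1: fill_fixed<1>(dst, color); break;
        default: break;
    }
}
```

Both functions produce identical output; the only difference is whether the tail runs through a loop or a single switch into unrolled code. Whether that difference is measurable is exactly what timing with and without the templated tail would show.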