I have a requirement where I need to extract the pixels of a given rectangle from an image stored as a byte[].
More specifically, I have an image in which I have narrowed down an area that I wish to keep, discarding the remainder. In my case I'm performing face recognition in real time, so I don't want to store unnecessary data, but obviously this could be useful elsewhere.
public static byte[] ExtractPixelsFromImage(this byte[] pixels, int top, int left, int width, int height, int horizontalResolution)
{
    List<byte> result = new List<byte>();
    int bytesPerPixel = 4;
    int rowLength = horizontalResolution * bytesPerPixel;
    int rowLeftPixelStart = (rowLength * top) + (left * bytesPerPixel);
    for (int i = 0; i < height; i++)
    {
        int currentPixelStart = (rowLength * i) + rowLeftPixelStart;
        result.AddRange(pixels.Skip(currentPixelStart).Take(width * bytesPerPixel));
    }
    return result.ToArray();
}
This method has been responsible for the vast majority of the CPU time my application is using, and I'm hoping to improve it.
The obvious suggestion is to remove the LINQ, but beyond this, is there anything else that can be improved?
At a quick glance, it would appear that your code is being slowed down by the use of LINQ, as well as by the conversion between List<byte> and byte[]. Only profiling, and deciding what a reasonable execution time for this code is, will tell you the real answer, but here are some things you can try:
- Don't use Skip and Take. Both of these methods operate on IEnumerable, meaning Skip may iterate from 0 to currentPixelStart despite the fact that the array can be accessed by index directly. It's possible that this optimization is done by these extension methods, but some quick searching seems to indicate it is not (and you don't want to rely on its implementation staying that way).
- Don't use List<byte>. For starters, one of the benefits of using a List is that it can grow dynamically as elements are added. In most cases it does pretty well managing the overhead of resizing the internal collection, but if you need the absolute best performance you can just use a byte array. You can already calculate how big the resulting byte array will be, so just allocate the proper size from the start. This also means avoiding the result.ToArray() call, since result will already be an array.
- Make bytesPerPixel a const. This one isn't likely to gain you anything noticeable in the performance department, but other methods could make use of this value, and right now it's hidden away in this method.
- Don't recalculate the extracted row size on every height iteration (i.e., move width * bytesPerPixel outside the loop, since it won't change).
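Putting those suggestions together, here is a sketch of what the rewritten method might look like (assuming 4 bytes per pixel, as in the original). Buffer.BlockCopy copies each row in a single call rather than byte by byte, and the result array is allocated at its exact final size up front:

```csharp
using System;

public static class PixelExtensions
{
    private const int BytesPerPixel = 4;

    public static byte[] ExtractPixelsFromImage(this byte[] pixels, int top, int left, int width, int height, int horizontalResolution)
    {
        int rowLength = horizontalResolution * BytesPerPixel; // bytes per source row
        int targetRowLength = width * BytesPerPixel;          // bytes per extracted row, computed once
        byte[] result = new byte[targetRowLength * height];   // exact size, no resizing, no ToArray()

        for (int i = 0; i < height; i++)
        {
            // Byte offset of the wanted span within source row (top + i).
            int sourceOffset = (rowLength * (top + i)) + (left * BytesPerPixel);
            Buffer.BlockCopy(pixels, sourceOffset, result, targetRowLength * i, targetRowLength);
        }
        return result;
    }
}
```

The per-row Buffer.BlockCopy replaces the Skip/Take enumeration entirely, which is where most of the savings come from.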
- Maxim Gershkovich (May 20, 2014): Absolutely awesome advice. Implemented all the suggested changes and the speed increase is massive. The method's CPU time has dropped from ~70% of the local CPU time to less than 0.1%. Huge difference! Thank you...
- Ocelot20 (May 21, 2014): Np, glad to help!
@Ocelot20 has already given you a good start on optimization. You should implement his suggestions first, before considering mine.
If you want some more of a boost, there are further possibilities. There is an article on this topic, but unfortunately it is in German. Its results:
- You can use pointers in C#, which give you a boost of about 12%. This is because unsafe code has no bounds checks.
- You can handle multiple pixels within one iteration of the for loop, which gives you an incredible boost of 49%. The last example in the article shows this technique. The boost comes from skipping lots of bounds checks and i++ increments.
Furthermore, you can use the second technique without pointers if you are targeting the client profile, which does not support unsafe code. You should test your code really hard when using unsafe code, because .NET will not help you at run-time.
By the way, I have heard you can simplify the last example by relying on the loop-unrolling capabilities of the compiler:
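As an illustration of the pointer technique (this is my own sketch, not code from the article), the inner copy can be done through fixed pointers, which pins the arrays and skips per-access bounds checks. Note that this requires the project to allow unsafe code (/unsafe):

```csharp
using System;

public static class UnsafePixelExtractor
{
    private const int BytesPerPixel = 4;

    public static unsafe byte[] ExtractPixels(byte[] pixels, int top, int left, int width, int height, int horizontalResolution)
    {
        int rowLength = horizontalResolution * BytesPerPixel;
        int targetRowLength = width * BytesPerPixel;
        byte[] result = new byte[targetRowLength * height];

        fixed (byte* src = pixels, dst = result)
        {
            for (int i = 0; i < height; i++)
            {
                // Raw pointers into the source row and the destination row.
                byte* s = src + (rowLength * (top + i)) + (left * BytesPerPixel);
                byte* d = dst + (targetRowLength * i);
                for (int j = 0; j < targetRowLength; j++)
                {
                    d[j] = s[j]; // no array bounds checks inside the fixed block
                }
            }
        }
        return result;
    }
}
```

As noted above, with unsafe code nothing protects you from reading or writing out of range, so the offset arithmetic must be tested thoroughly.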
for(var y=0; y<height; y++){
for(var x=0; x<width; x+=8){ //width has to be a multiple of 8
for(var i=0; i<8; i++){ //this should be unrolled by your compiler
//handle a single pixel here
}
}
}
Should be the same (in speed and function) as:
for(var y=0; y<height; y++){
for(var x=0; x<width; x+=8){ //width has to be a multiple of 8
//handle 8 pixels at once here
}
}
- M. Mimpen (May 20, 2014): Isn't loop unrolling done by the .NET JIT compiler?
- Maxim Gershkovich (May 20, 2014): Cheers for the input. I'll definitely try your suggestions as soon as I get some time. Love the idea of scraping multiple rows per pass; will need to try that...
- Comment: At the very least, pre-size the list: List<byte> result = new List<byte>(bytesPerPixel * width * height);
- Comment: Consider Buffer.BlockCopy for fast array block copying.