This is a C++ program which outputs a Mandelbrot fractal image in a graphics window and lets the user zoom and pan around the image, generating a new image each time the user does so. I'm using SFML for the graphics part.
Because it generates a new image after every zoom or pan, it is pretty slow, especially with a 1000x600 image (it can take up to half a second for the new image to load).
#include <SFML/Graphics.hpp>
#include <cmath>
const int IMAGE_WIDTH = 1000;
const int IMAGE_HEIGHT = 600;
double zoom = 0.004; // allow the user to zoom in and out...
double offsetX = -0.7; // and move around
double offsetY = 0.0;
const int MAX = 127; // maximum number of iterations for mandelbrot()
// don't increase MAX or the colouring will look strange
int mandelbrot(double, double, int);
sf::Color getColor(int);
int main() {
sf::RenderWindow window(sf::VideoMode(IMAGE_WIDTH, IMAGE_HEIGHT), "Mandelbrot");
window.setFramerateLimit(30);
sf::Image image;
image.create(IMAGE_WIDTH, IMAGE_HEIGHT, sf::Color(0, 0, 0));
sf::Texture texture;
sf::Sprite sprite;
bool stateChanged = true; // track whether the image needs to be regenerated
while (window.isOpen()) {
sf::Event event;
while (window.pollEvent(event)) {
switch (event.type) {
case sf::Event::Closed:
window.close();
break;
case sf::Event::KeyPressed:
stateChanged = true; // image needs to be recreated when the user changes zoom or offset
switch (event.key.code) {
case sf::Keyboard::Escape:
window.close();
break;
case sf::Keyboard::Equal:
zoom *= 0.9;
break;
case sf::Keyboard::Dash:
zoom /= 0.9;
break;
case sf::Keyboard::W:
offsetY -= 40 * zoom;
break;
case sf::Keyboard::S:
offsetY += 40 * zoom;
break;
case sf::Keyboard::A:
offsetX -= 40 * zoom;
break;
case sf::Keyboard::D:
offsetX += 40 * zoom;
break;
default: break;
}
default:
break;
}
}
if (stateChanged) { // only generate a new image if something has changed, to avoid unnecessary lag
for (int x = 0; x < IMAGE_WIDTH; x++) {
for (int y = 0; y < IMAGE_HEIGHT; y++) {
// convert x and y to the appropriate complex number
double real = (x - IMAGE_WIDTH / 2.0) * zoom + offsetX;
double imag = (y - IMAGE_HEIGHT / 2.0) * zoom + offsetY;
int value = mandelbrot(real, imag, MAX);
image.setPixel(x, y, getColor(value));
}
}
texture.loadFromImage(image);
sprite.setTexture(texture);
}
window.clear();
window.draw(sprite);
window.display();
stateChanged = false;
}
return 0;
}
int mandelbrot(double startReal, double startImag, int maximum) {
int counter = 0;
double zReal = startReal;
double zImag = startImag;
double nextRe;
while (pow(zReal, 2.0) + pow(zImag, 2.0) <= 4.0 && counter <= maximum) {
nextRe = pow(zReal, 2.0) - pow(zImag, 2.0) + startReal;
zImag = 2.0 * zReal * zImag + startImag;
zReal = nextRe;
if (zReal == startReal && zImag == startImag) { // a repetition indicates that the point is in the Mandelbrot set
return -1; // points in the Mandelbrot set are represented by a return value of -1
}
counter += 1;
}
if (counter >= maximum) {
return -1; // -1 is used here to indicate that the point lies within the Mandelbrot set
} else {
return counter; // returning the number of iterations allows for colouring
}
}
sf::Color getColor(int iterations) {
int r, g, b;
if (iterations == -1) {
r = 0;
g = 0;
b = 0;
} else if (iterations == 0) {
r = 255;
g = 0;
b = 0;
} else {
// colour gradient: Red -> Blue -> Green -> Red -> Black
// corresponding values: 0 -> 16 -> 32 -> 64 -> 127 (or -1)
if (iterations < 16) {
r = 16 * (16 - iterations);
g = 0;
b = 16 * iterations - 1;
} else if (iterations < 32) {
r = 0;
g = 16 * (iterations - 16);
b = 16 * (32 - iterations) - 1;
} else if (iterations < 64) {
r = 8 * (iterations - 32);
g = 8 * (64 - iterations) - 1;
b = 0;
} else { // range is 64 - 127
r = 255 - (iterations - 64) * 4;
g = 0;
b = 0;
}
}
return sf::Color(r, g, b);
}
The window immediately after it is opened:
The window after zooming in (using the + key) and panning upwards (using the W key):
-
2\$\begingroup\$ What are your review goals? Speed? Readability? \$\endgroup\$Nobody moving away from SE– Nobody moving away from SE2016年03月31日 12:37:24 +00:00Commented Mar 31, 2016 at 12:37
-
1\$\begingroup\$ @Nobody I'm more familiar with Java and Python, so I'm interested in making my code more consistent with C++ style. Readability is also concern for me. Speed is less of a priority for me, as I think I have a rough idea of where to improve. \$\endgroup\$laurencevs– laurencevs2016年03月31日 12:57:08 +00:00Commented Mar 31, 2016 at 12:57
-
2\$\begingroup\$ You call window.clear(), window.draw() and window.display() after executing window.close(), do you really want to do that? \$\endgroup\$pacmaninbw– pacmaninbw ♦2016年03月31日 13:23:09 +00:00Commented Mar 31, 2016 at 13:23
-
7\$\begingroup\$ Half a second o_O thinks wistfully back in the times it took a full 5 minutes. How much we have progressed.. \$\endgroup\$Tschallacka– Tschallacka2016年04月01日 09:19:07 +00:00Commented Apr 1, 2016 at 9:19
-
1\$\begingroup\$ Also instead of pixel-wise manipulation and calling a function, is there some option to save an image buffer and then replace the whole image? \$\endgroup\$mathreadler– mathreadler2016年04月02日 17:13:24 +00:00Commented Apr 2, 2016 at 17:13
7 Answers 7
I see some things that may help you improve your code.
Move loop invariants outside the loop
To maximize performance, moving loop invariants outside the loop often helps. This is an optimization that many compilers can make on their own, but it helps to write it explicitly. In particular, the nested loop within main
where the image is recalculated. Right now the loop looks like this:
for (int x = 0; x < IMAGE_WIDTH; x++) {
for (int y = 0; y < IMAGE_HEIGHT; y++) {
// convert x and y to the appropriate complex number
double real = (x - IMAGE_WIDTH / 2.0) * zoom + offsetX;
double imag = (y - IMAGE_HEIGHT / 2.0) * zoom + offsetY;
int value = mandelbrot(real, imag, MAX);
image.setPixel(x, y, getColor(value));
}
}
However, with a little bit of algebra, we see that the real
and imag
calculations could be rewritten as
double real = x * zoom - IMAGE_WIDTH / 2.0 * zoom + offsetX;
double imag = y * zoom - IMAGE_HEIGHT / 2.0 * zoom + offsetY;
Since only x
and y
are actually changing within the loop, it's actually simple to replace all of that with this:
double real = 0 * zoom - IMAGE_WIDTH / 2.0 * zoom + offsetX;
double imagstart = 0 * zoom - IMAGE_HEIGHT / 2.0 * zoom + offsetY;
for (int x = 0; x < IMAGE_WIDTH; x++, real += zoom) {
double imag = imagstart;
for (int y = 0; y < IMAGE_HEIGHT; y++, imag += zoom) {
// convert x and y to the appropriate complex number
int value = mandelbrot(real, imag, MAX);
image.setPixel(x, y, getColor(value));
}
}
Avoid special cases
The value of -1
is used to represent points within the Mandelbrot set, but then this value is explicitly checked when assigning a color and is equivalent to MAX
. Why not simply assign the value of MAX
and avoid the special case?
Prefer simple lookup to code
The getColor()
function gets a single number in the range of 0 to MAX (if the previous suggestion is taken) and returns a color. Why not replace this with a table lookup? Compute the table values once (either at compile-time or run-time) and then use the table. At the beginning of main
you could have this:
std::array<sf::Color, MAX+1> colors;
for (int i=0; i <= MAX; ++i) {
colors[i] = getColor(i);
}
Then the loop would use this:
image.setPixel(x, y, colors[value]);
Don't use ==
for floating point numbers
Using ==
with floating point numbers is generally not a good idea because if they differ only by some small amount (say 1e-99), they are not equal. In fact, if you modify the code to count the number of times that statement evaluates to true
, I think you will find that it never does and is therefore simply wasting time.
For more depth about floating point issues, I'd recommend the excellent article "What every computer scientist should know about floating-point arithmetic" by David Goldberg.
Optimize loops for early bailout
The mandelbrot
loop can be rewritten more simply like this:
int mandelbrot(double startReal, double startImag) {
double zReal = startReal;
double zImag = startImag;
for (int counter = 0; counter < MAX; ++counter) {
double r2 = zReal * zReal;
double i2 = zImag * zImag;
if (r2 + i2 > 4.0) {
return counter;
}
zImag = 2.0 * zReal * zImag + startImag;
zReal = r2 - i2 + startReal;
}
return MAX;
}
Several changes were made here. First, since the original third parameter maximum
was always passed in with the value of MAX
, it was simply eliminated from the parameter list and the value MAX
substituted directly.
Next, the loop is now a for
loop instead of a while
loop which makes the structure more obvious.
Next, the r2
and i2
values are precomputed, making the code a little shorter and simpler (and probably more efficient).
Lastly, the early bailout is done within the loop. Only if we get to the end of the loop does the value MAX
get returned.
Omit work not needed
The window.clear()
call near the end of main
is not needed since the following line writes the entire window anyway. That line can simply be eliminated.
Eliminate global variables
All but MAX
can be made local to within main
.
Eliminate framerate limit
Unless there is a reason to do so, you might find that eliminating the call to window.setFramerateLimit
or setting the value to 0
allows the program to be more responsive. After I applied all of the suggestions mentioned above, I found that the framerate limit was the thing preventing further speed increases on my machine.
Don't update the screen if not needed
Right now any key (even those that have no effect, cause the screen to recalculate. This is easily fixed by adding stateChanged = false;
to the default case in the key event handling switch
statement.
Use C++11 threads
One can fairly painlessly parallelize this code. I have changed the end of main so that it now looks like this:
if (stateChanged) {
updateImage(zoom, offsetX, offsetY, image);
texture.loadFromImage(image);
sprite.setTexture(texture);
stateChanged = false;
}
window.draw(sprite);
window.display();
}
}
As you can see, the nested loops have been replaced by a call to updateImage
. It now looks like this:
void updateImage(double zoom, double offsetX, double offsetY, sf::Image& image)
{
std::vector<std::thread> threads;
for (int i = 0; i < IMAGE_HEIGHT; i += 100) {
threads.push_back(std::thread(updateImageSlice, zoom, offsetX, offsetY,
std::ref(image), i, i+100));
}
for (auto &t : threads) {
t.join();
}
}
The updateImageSlice
does exactly what it claims to do and looks like this:
void updateImageSlice(double zoom, double offsetX, double offsetY, sf::Image& image, int minY, int maxY)
{
double real = 0 * zoom - IMAGE_WIDTH / 2.0 * zoom + offsetX;
double imagstart = minY * zoom - IMAGE_HEIGHT / 2.0 * zoom + offsetY;
for (int x = 0; x < IMAGE_WIDTH; x++, real += zoom) {
double imag = imagstart;
for (int y = minY; y < maxY; y++, imag += zoom) {
int value = mandelbrot(real, imag);
image.setPixel(x, y, colors[value]);
}
}
}
With this, things are noticably faster and look good until I get to a zoom factor of around 2e-16 or smaller, where there is noticable banding due to the previously mentioned floating point issues. A more careful reworking of the calculations could eliminate or defer that, but I didn't take the time to do so.
Also I should point out that sharing a single reference to the image
variable is not generally the way to handle shared threads because it leads to at least the potential of a data race and corruption. However, as I guessed, the structure of the image
object is such that the threads do not seem to contend for the same memory locations.
Revised version
Per request in the comments, here's the entire thing. This actually goes a few steps farther in that it turns most of the program into a Mandelbrot
object and also scales the number of threads per std::thread::hardware_concurrency
. It requires a C++11 compliant compiler.
#include <SFML/Graphics.hpp>
#include <array>
#include <vector>
#include <thread>
static constexpr int IMAGE_WIDTH = 1000;
static constexpr int IMAGE_HEIGHT = 600;
class Mandelbrot {
public:
Mandelbrot();
void updateImage(double zoom, double offsetX, double offsetY, sf::Image& image) const;
private:
static const int MAX = 127; // maximum number of iterations for mandelbrot()
// don't increase MAX or the colouring will look strange
std::array<sf::Color, MAX+1> colors;
int mandelbrot(double startReal, double startImag) const;
sf::Color getColor(int iterations) const;
void updateImageSlice(double zoom, double offsetX, double offsetY, sf::Image& image, int minY, int maxY) const;
};
Mandelbrot::Mandelbrot() {
for (int i=0; i <= MAX; ++i) {
colors[i] = getColor(i);
}
}
int Mandelbrot::mandelbrot(double startReal, double startImag) const {
double zReal = startReal;
double zImag = startImag;
for (int counter = 0; counter < MAX; ++counter) {
double r2 = zReal * zReal;
double i2 = zImag * zImag;
if (r2 + i2 > 4.0) {
return counter;
}
zImag = 2.0 * zReal * zImag + startImag;
zReal = r2 - i2 + startReal;
}
return MAX;
}
sf::Color Mandelbrot::getColor(int iterations) const {
int r, g, b;
// colour gradient: Red -> Blue -> Green -> Red -> Black
// corresponding values: 0 -> 16 -> 32 -> 64 -> 127 (or -1)
if (iterations < 16) {
r = 16 * (16 - iterations);
g = 0;
b = 16 * iterations - 1;
} else if (iterations < 32) {
r = 0;
g = 16 * (iterations - 16);
b = 16 * (32 - iterations) - 1;
} else if (iterations < 64) {
r = 8 * (iterations - 32);
g = 8 * (64 - iterations) - 1;
b = 0;
} else { // range is 64 - 127
r = 255 - (iterations - 64) * 4;
g = 0;
b = 0;
}
return sf::Color(r, g, b);
}
void Mandelbrot::updateImageSlice(double zoom, double offsetX, double offsetY, sf::Image& image, int minY, int maxY) const
{
double real = 0 * zoom - IMAGE_WIDTH / 2.0 * zoom + offsetX;
double imagstart = minY * zoom - IMAGE_HEIGHT / 2.0 * zoom + offsetY;
for (int x = 0; x < IMAGE_WIDTH; x++, real += zoom) {
double imag = imagstart;
for (int y = minY; y < maxY; y++, imag += zoom) {
int value = mandelbrot(real, imag);
image.setPixel(x, y, colors[value]);
}
}
}
void Mandelbrot::updateImage(double zoom, double offsetX, double offsetY, sf::Image& image) const
{
const int STEP = IMAGE_HEIGHT/std::thread::hardware_concurrency();
std::vector<std::thread> threads;
for (int i = 0; i < IMAGE_HEIGHT; i += STEP) {
threads.push_back(std::thread(&Mandelbrot::updateImageSlice, *this, zoom, offsetX, offsetY, std::ref(image), i, std::min(i+STEP, IMAGE_HEIGHT)));
}
for (auto &t : threads) {
t.join();
}
}
int main() {
double offsetX = -0.7; // and move around
double offsetY = 0.0;
double zoom = 0.004; // allow the user to zoom in and out...
Mandelbrot mb;
sf::RenderWindow window(sf::VideoMode(IMAGE_WIDTH, IMAGE_HEIGHT), "Mandelbrot");
window.setFramerateLimit(0);
sf::Image image;
image.create(IMAGE_WIDTH, IMAGE_HEIGHT, sf::Color(0, 0, 0));
sf::Texture texture;
sf::Sprite sprite;
bool stateChanged = true; // track whether the image needs to be regenerated
while (window.isOpen()) {
sf::Event event;
while (window.pollEvent(event)) {
switch (event.type) {
case sf::Event::Closed:
window.close();
break;
case sf::Event::KeyPressed:
stateChanged = true; // image needs to be recreated when the user changes zoom or offset
switch (event.key.code) {
case sf::Keyboard::Escape:
window.close();
break;
case sf::Keyboard::Equal:
zoom *= 0.9;
break;
case sf::Keyboard::Dash:
zoom /= 0.9;
break;
case sf::Keyboard::W:
offsetY -= 40 * zoom;
break;
case sf::Keyboard::S:
offsetY += 40 * zoom;
break;
case sf::Keyboard::A:
offsetX -= 40 * zoom;
break;
case sf::Keyboard::D:
offsetX += 40 * zoom;
break;
default:
stateChanged = false;
break;
}
default:
break;
}
}
if (stateChanged) {
mb.updateImage(zoom, offsetX, offsetY, image);
texture.loadFromImage(image);
sprite.setTexture(texture);
stateChanged = false;
}
window.draw(sprite);
window.display();
}
}
-
\$\begingroup\$ Did you measure that multithreaded implementation? The overhead of starting one thread per column of the image might be too fine granular. Furthermore, it is very likely that starting IMAGE_WIDTH threads is too much for usual CPUs which will result in unecessary many context switches. \$\endgroup\$Nobody moving away from SE– Nobody moving away from SE2016年03月31日 19:14:59 +00:00Commented Mar 31, 2016 at 19:14
-
2\$\begingroup\$ Yes I measured it and it's faster. I think you've misread the code. It's not one thread per column of the image which would be 1000 threads, but rather one thread per 100 horizontal lines which is 6 threads for this code. \$\endgroup\$Edward– Edward2016年03月31日 19:17:52 +00:00Commented Mar 31, 2016 at 19:17
-
\$\begingroup\$ Yes, I missed the
+= 100
. Yet that is a magic number and does not scale automatically with different image sizes/number of cores. \$\endgroup\$Nobody moving away from SE– Nobody moving away from SE2016年03月31日 19:19:34 +00:00Commented Mar 31, 2016 at 19:19 -
\$\begingroup\$ Could you give a full listing of your final code? I would like to run measurements on it and I have difficulties recreating it. \$\endgroup\$Nobody moving away from SE– Nobody moving away from SE2016年03月31日 19:41:12 +00:00Commented Mar 31, 2016 at 19:41
-
\$\begingroup\$ @Nobody: Sure! I look forward to your timing results. I posted the full code to my answer. \$\endgroup\$Edward– Edward2016年03月31日 21:35:01 +00:00Commented Mar 31, 2016 at 21:35
Namings
- Unlike in Python, all-caps names are usually left to macros (which you should avoid, by the way) in C++; for example:
M_PI
,CHAR_BIT
and so on.
Moden Language Features
There are several places where you can take advantage of modern C++ features:
constexpr
: pure constants and magic numbers should be declared asconstexpr
.static constexpr int image_width = 1000; static constexpr int image_height = 600; static constexpr int max_iterations = 127;
Constants could also be collected in a namespace (e.g.
namespace constants
) or underenum
s.Uniform initialization: you could use it to simplify some parts by not writing explicit types. In
getColor
, prefer:sf::Color getColor(int iterations) { int r, g, b; // ... return {r, g, b}; }
The same holds for
sf::RenderWindow
/sf::Color
:sf::RenderWindow window { {IMAGE_WIDTH, IMAGE_HEIGHT}, "Mandelbrot"};
Performance
A minor optimization for getColor
would try to reduce conditional assignments by giving r, g, b
default values and so changing them only if necessary:
sf::Color getColor(int iterations) {
int r = g = b = 0;
if (iterations == -1) {
// nothing to do
} else if (iterations == 0) {
r = 255;
// g = 0;
// b = 0;
} else {
// colour gradient: Red -> Blue -> Green -> Red -> Black
// corresponding values: 0 -> 16 -> 32 -> 64 -> 127 (or -1)
if (iterations < 16) {
r = 16 * (16 - iterations);
// g = 0;
b = 16 * iterations - 1;
} else if (iterations < 32) {
// r = 0;
g = 16 * (iterations - 16);
b = 16 * (32 - iterations) - 1;
} else if (iterations < 64) {
r = 8 * (iterations - 32);
g = 8 * (64 - iterations) - 1;
// b = 0;
} else { // range is 64 - 127
r = 255 - (iterations - 64) * 4;
// g = 0;
// b = 0;
}
}
Other Details
- Globals should generally be avoided;
- The
maximum
mandelbrot
could default toMAX
i.e.
int mandelbrot(double startReal, double startImag, int maximum = ::MAX)
zoom
should have a maximum level and therefore you should ensure it's not broken (double
-s aresigned
).
-
\$\begingroup\$ Could you elaborate on
constexpr
? Why is it preferred overconst
? \$\endgroup\$jacwah– jacwah2016年03月31日 18:42:29 +00:00Commented Mar 31, 2016 at 18:42 -
2\$\begingroup\$ @jacwah
const
tells the compiler such value won't change throughout program's life; it doesn't however say when such constant will be assigned. It's perfectly fine if it's done at run-time.constexpr
does not allow this: values have to strictly be constant expressions and the compiler shall ensure that. For more info, see here. \$\endgroup\$edmz– edmz2016年03月31日 19:02:14 +00:00Commented Mar 31, 2016 at 19:02 -
-
\$\begingroup\$ @jacwah You'd therefore prefer it over
const
if you're sure about the data you have. It's also required in some contexts (e.g. templates and meta-programming in general). It might also help produce faster code. \$\endgroup\$edmz– edmz2016年03月31日 19:20:32 +00:00Commented Mar 31, 2016 at 19:20
I'm going to comment mostly on performance aspects of your code here. Stylistic parts of your code should be improved upon by someone more conversant in C++ than I am.
Magic Numbers
You have factored out some constants, but when you calculate zooming, 0.9
and 40
are "magic numbers", numbers with no explanation as to what they do, and those whose usage entails updating values at multiple locations when you change them. Maybe you would like to make them constants and call them ZOOM_DOWN_FACTOR
and ZOOM_UP_FACTOR
respectively.
Performance
I realize compiling with
-O3
and-ffastmath
might make your compiler turn thosepow()
calls into plain multiplications, which are faster, so compile for release with those options. To compensate for floating-point inaccuracies, you should factor in a small Epsilon (maximum allowed deviation) value of about 10-15. Try changing your bailout condition to <= 4.0 + 10-15.One pointer for faster panning: factor out your image generation code into a different method from
main()
, which accepts parameters defining the range of pixels of the image which should be subjected to Mandelbrot calculation. Call this function on every pan event with the range of pixels which have been invalidated due to the pan, get this image with some blank pixels and merge it with the old image to get the panned image. Should make panning orders of magnitude faster.
Miscellaneous
counter >= maximum
is unnecessary. An==
will do just fine here.- Suggestion: Use
return
instead ofbreak
after executingwindow.close()
in the relevantcase
statements. Should avoid a potential memory leak. (Thanks @pacmaninbw for bringing this to my notice).
Minor useful tips from a (reasonably) experienced fractal programmer:
(me, of course. Look here, especially as you are familiar with Java)
- Implement a suitable form of color smoothing. Linear interpolation should help. Its a bit bit expensive on performance, but it vastly improves image quality. Exponential smoothing should help provide a suitable interpolation value.
This is the formula: expiter += exp(-cabs(z)-0.5/(cabs(zold - z)))
.
Here, exp(x)
is the standard exponential function e^x, your math library has it. zold
and z
are complex numbers, zold
is the value of z
from the previous iteration. cabs(z)
returns the square of the modulus of z
(basically the sum of the squares of the real and imaginary components of z
). expiter
is the fractional or smooth iteration count.
For each RGB component, apply linear interpolation as follows:
value = toValue * expiter + fromValue * ( 1 - expiter)
. toValue
and fromValue
are basically the values of the said RGB component for iterations
and iterations + 1
-
3\$\begingroup\$ +1 for
-O3
. I remember when-O3
outdid my hand optimizations on a raytracer by an order of magnitude when I started learning C++ and tried it for the first time. \$\endgroup\$Nobody moving away from SE– Nobody moving away from SE2016年03月31日 13:04:23 +00:00Commented Mar 31, 2016 at 13:04 -
\$\begingroup\$ Thank you, this is all very helpful! Colour smoothing sounds good, so I'll look into it soon. \$\endgroup\$laurencevs– laurencevs2016年03月31日 14:26:58 +00:00Commented Mar 31, 2016 at 14:26
Use library functionality
Mandelbrot images are calculated with complex numbers. C++ offers the <complex>
header which contains a template for generating complex numbers out of floats or other numeric types.
Initialize at definition
Many compilers can give you warnings about code paths with uninitialized variables but in general it might be good to zero initialize your r
, g
, and b
values to zero at the beginning of the getColor
function. This also allows to remove all later zero assignments in the if
s.
Cascading if
s
Often it is a bad idea to cascade if
s in the way you have done.
Since your code uses equal spacing in multiples of 16 you could instead use a switch
:
if(iterations < 0) {
} else {
switch(iterations / 16) {
case 0: // deal with 0-15
break;
case 1: // deal with 16-31
break;
case 2: // deal with 32-47
/* no break! */
case 3: // deal with 48-63
break;
default: // deal with iterations >= 64
}
}
Parallelize
The results of different pixels are independent of each other. This screams for parallelization. It might suffice to use OpenMP's parallel for.
Misc
- use
++counter
instead ofcounter += 1;
for incrementing (likely to be optimized by compiler)
-
\$\begingroup\$ Thanks for all your help. In particular, using a
switch
in the way you suggested makes thegetColor
function quite a lot neater! \$\endgroup\$laurencevs– laurencevs2016年03月31日 14:32:30 +00:00Commented Mar 31, 2016 at 14:32 -
\$\begingroup\$ @monopole: I would consider this a hack instead of idiomatic and it was intended as a speed optimization due to reduced branching. \$\endgroup\$Nobody moving away from SE– Nobody moving away from SE2016年03月31日 15:02:07 +00:00Commented Mar 31, 2016 at 15:02
-
\$\begingroup\$ Don't need to use
OpenMP
any more . C++ comes with its own parallel library. Just useasync()
\$\endgroup\$Loki Astari– Loki Astari2016年03月31日 16:01:06 +00:00Commented Mar 31, 2016 at 16:01 -
\$\begingroup\$ You should probably use
pow(x,2)
(rather than 2.0) see if the compiler optimizes that. \$\endgroup\$Loki Astari– Loki Astari2016年03月31日 16:02:38 +00:00Commented Mar 31, 2016 at 16:02 -
\$\begingroup\$ @LokiAstari: While I agree that there are possibilities in standard C++ that allow for parallelization, an OpenMP parallel for would be far easier to write in this case. \$\endgroup\$Nobody moving away from SE– Nobody moving away from SE2016年03月31日 16:06:47 +00:00Commented Mar 31, 2016 at 16:06
I have to start with saying that this is a really cool and well made project! If you would like to improve the performance of your code, I recommend offloading work to GPU shaders. mandelbrot
easily fits into the description of a fragment shader. You can use OpenGL through SFML to achieve this, see this and this.
Naming, minor but useful
int mandelbrot
- this function returns number of iterations, name mandelbrot
does not tell anything about its purpose.
int counter = 0;
- this variable holds the number of iteration. Name numIterations
would be better because it explains much more
I see you already have got many answers about good code practices like naming conventions and other good things. Here's some optimization help for computational algorithms for modern CPUs in general.
- Replace conditionals (if, while, switch-case) inside loops with arithmetics whenever it is possible. This is important to not risk destroy instruction pipelining in modern processors.
Examples in code: if's and whiles inside mandelbrot and getColor functions.
- Optimization settings for your compiler. Allow it to optimize, try and unroll loops if it can, help it pick good assembler instructions by letting it know what CPU you have. Allow it to inline function calls if the function calls are in loops. Maybe use fast instructions and lower precision (for example float instead of double) if reduced precision is OK. I suspect you can use fast instructions without any problems on this particular program.
Examples in code: allowing the compiler to do inlining can avoid branching on the calls to the mandelbrot and getColor functions (which are inside the loops).
- Memory planning and structuring. Plan the data structures and calculations to utilize the RAM caches as much as you can. Difference between missing and hitting caches can easily affect performance tens to hundreds of times.