Conditionally creating a new vertex

Question 1

I have a function which gets called very often, for this test 30,720,000 times. It has to run in real-time, so performance is very important. Are there possible improvements?

// This function looks up in the vertice table wheter a vertice at that position already exists
// - If yes, return the vertice index
// - If no, create a new vertice and return the created vertice index
//
// @param cfg: Pointer to a chunkConfig struct
// @param x: x position of the vertice
// @param y: y position of the vertice
inline VERTICE_INDEX GetOrCreateVertice(float x, float y, ChunkGenerator::chunkConfig *cfg) 
{
 int x_pos = int((x*POSARR_ADJ_F) + 0.5f);
 int y_pos = int((y*POSARR_ADJ_F) + 0.5f); 
 int dict_index = x_pos + (y_pos * cfg->subdivisions_adj);
 VERTICE_INDEX dict_entry = cfg->vertice_pool.vertice_dict[dict_index];
 VERTICE_INDEX current_index = cfg->vertice_pool.current_vertice_index;
 if (dict_entry >= 0 && dict_entry < 65535 && current_index > 0)
 return dict_entry;
 LVecBase3f* offset = cfg->base_position;
 LVecBase2f* dim = cfg->dimensions;
 LVecBase2f* tex_offset = cfg->texture_offset;
 LVecBase2f* tex_scale = cfg->texture_scale;
 int pool_index = ((int)current_index) * 5;
 float base_scale = 1.0 / (cfg->subdivisions_f-1.0);
 float x_scaled = x * base_scale;
 float y_scaled = y * base_scale;
 cfg->vertice_pool.vertice_array[pool_index+0] = offset->get_x() + (x_scaled * dim->get_x());
 cfg->vertice_pool.vertice_array[pool_index+1] = offset->get_y() + (y_scaled * dim->get_y());
 cfg->vertice_pool.vertice_array[pool_index+2] = offset->get_z();
 cfg->vertice_pool.vertice_array[pool_index+3] = tex_offset->get_x() + (x_scaled * tex_scale->get_x());
 cfg->vertice_pool.vertice_array[pool_index+4] = tex_offset->get_y() + (y_scaled * tex_scale->get_y());
 cfg->vertice_pool.vertice_dict[dict_index] = current_index;
 cfg->vertice_pool.current_vertice_index++;
 return current_index;
}

LVecBase3f and LVecBase2f are vector-types provided by the graphics-engine I use. VERTICE_INDEX is a unsigned short, POSARR_ADJ and POSARR_ADJ_F is constant 2.

This is the chunkConfig struct:

struct chunkConfig {
 int subdivisions;
 int subdivisions_adj;
 float subdivisions_f;
 LVecBase3f *base_position;
 LVecBase2f *dimensions;
 LVecBase2f *texture_offset;
 LVecBase2f *texture_scale;
 verticePool vertice_pool;
};
struct verticePool {
 VERTICE_INDEX current_vertice_index;
 VERTICE_INDEX current_primitive_index;
 VERTICE_INDEX * vertice_dict;
 float *vertice_array;
 VERTICE_INDEX *primitive_array;
};

Performance result measured by very-sleepy: link

Version based on the suggestions made in the comments: very-sleepy, AMD CodeAnalyst, and the assembler for the slow-line: generated assembler, I also made some arrays global, and renamed "vertice" to "vertex".

Question 2

Why did you document parameters that don't exist?

Question 3

What do those time figures mean? I seriously doubt the calculation of dict_index takes a whole second, assuming subdivisions and POSARR_ADJ are integers.

Question 4

Oh, I missed to remove them :) I updated the question. The 1.00s for example means the three lines above took 1.00s, that's because very-sleepy samples the current execution pointer. The time-result always means the time of the lines up to the last time-result.

Question 5

I have added more details :) Maybe that helps

Question 6

A maintainability rather than performance note: It might help if you used real words in your design. There is no such thing as a "vertice". The singular of "vertices" is "vertex".

Question 7

It looks to me like your function is actually doing 2 things:

It's checking to see if an entry exists in the dictionary, and if so return it
If it doesn't exist, then it creates a new one

So I would break it into 2 functions:

VERTICE_INDEX FindVerticeInDict(const float x, const float y, 
 const ChunkGenerator::chunkConfig& cfg, VERTICE_INDEX& current_index);
VERTICE_INDEX GetOrCreateVertice(const float x, const float y,
 ChunkGenerator::chunkConfig *cfg);

Note that I made x and y constants because they aren't changed by either function. Also, in the Find method, I made the cfg argument const because it's not being changed there either. It is modified in GetOrCreate though, so I left it as a pointer. (It could also be a non-const reference, if you prefer that style.)

I would create a named constant for the "entry not found" value:

const VERTICE_INDEX ENTRY_NOT_FOUND = 0xFFFF;

So it would look something like this:

VERTICE_INDEX FindVerticeInDict(const float x, const float y, 
 const ChunkGenerator::chunkConfig& cfg, VERTICE_INDEX& current_index) 
{
 int x_pos = int((x*POSARR_ADJ_F) + 0.5f);
 int y_pos = int((y*POSARR_ADJ_F) + 0.5f); 
 int dict_index = x_pos + (y_pos * cfg.subdivisions_adj);
 VERTICE_INDEX dict_entry = cfg.vertice_pool.vertice_dict[dict_index];
 current_index = cfg.vertice_pool.current_vertice_index;
 if (dict_entry >= 0 && dict_entry < ENTRY_NOT_FOUND && current_index > 0)
 return dict_entry;
 return ENTRY_NOT_FOUND;
}

For your vertex array, you really should create a struct that contains the members you're going to use rather than looking them up in a 1D array of floats. It's error prone, hard-to-read, and hard-to-maintain the way you're doing it now. I'd create a struct like this:

typedef struct Vertex {
 Point3D vertex_coord; // This might be the same as LVecBase3f?
 Point2D texture_coord; // This might be the same as LVecBase2f?
} Vertex;

Where Point3D is just:

typedef struct Point3D {
 float x;
 float y;
 float z;
} Point3D;

and Point2D just has an x and y float values. Doing this will allow you to remove the magical pool_index = ((int)current_index) * 5;, and have names for the values you're writing.

Also, you don't need to use the inline directive. The compiler can decide to ignore it, and it can decide to make something you didn't mark as inline inline. So there's not a lot of reason to use it.

You can reduce the verbosity of your code, and possibly reduce the number of pointer dereferences the compiler produces by getting the address of the vertex you want and assigning all its members through that one pointer.

Given the above, your function would now look like this:

VERTICE_INDEX GetOrCreateVertice(const float x, const float y, 
 ChunkGenerator::chunkConfig *cfg) 
{
 VERTICE_INDEX current_index = 0;
 VERTICE_INDEX dict_entry = FindVerticeInDict(x, y, *cfg, current_index);
 if (dict_entry != ENTRY_NOT_FOUND)
 return dict_entry;
 LVecBase3f* offset = cfg->base_position;
 LVecBase2f* dim = cfg->dimensions;
 LVecBase2f* tex_offset = cfg->texture_offset;
 LVecBase2f* tex_scale = cfg->texture_scale;
 float base_scale = 1.0 / (cfg->subdivisions_f-1.0);
 float x_scaled = x * base_scale;
 float y_scaled = y * base_scale;
 Vertex* vertex = &cfg->vertice_pool.vertice_arry [ current_index ];
 vertex->vertex_coord.x = offset->get_x() + (x_scaled * dim->get_x());
 vertex->vertex_coord.y = offset->get_y() + (y_scaled * dim->get_y());
 vertex->vertex_coord.z = offset->get_z();
 vertex->texture_coord.x = tex_offset->get_x() + (x_scaled * tex_scale->get_x());
 vertex->texture_coord.y = tex_offset->get_y() + (y_scaled * tex_scale->get_y());
 cfg->vertice_pool.vertice_dict[dict_index] = current_index;
 cfg->vertice_pool.current_vertice_index++;
 return current_index;
}

score 2 · Answer 1 · 2015-06-18 05:18:16Z

It looks to me like your function is actually doing 2 things:

It's checking to see if an entry exists in the dictionary, and if so return it
If it doesn't exist, then it creates a new one

So I would break it into 2 functions:

VERTICE_INDEX FindVerticeInDict(const float x, const float y, 
 const ChunkGenerator::chunkConfig& cfg, VERTICE_INDEX& current_index);
VERTICE_INDEX GetOrCreateVertice(const float x, const float y,
 ChunkGenerator::chunkConfig *cfg);

Note that I made x and y constants because they aren't changed by either function. Also, in the Find method, I made the cfg argument const because it's not being changed there either. It is modified in GetOrCreate though, so I left it as a pointer. (It could also be a non-const reference, if you prefer that style.)

I would create a named constant for the "entry not found" value:

const VERTICE_INDEX ENTRY_NOT_FOUND = 0xFFFF;

So it would look something like this:

VERTICE_INDEX FindVerticeInDict(const float x, const float y, 
 const ChunkGenerator::chunkConfig& cfg, VERTICE_INDEX& current_index) 
{
 int x_pos = int((x*POSARR_ADJ_F) + 0.5f);
 int y_pos = int((y*POSARR_ADJ_F) + 0.5f); 
 int dict_index = x_pos + (y_pos * cfg.subdivisions_adj);
 VERTICE_INDEX dict_entry = cfg.vertice_pool.vertice_dict[dict_index];
 current_index = cfg.vertice_pool.current_vertice_index;
 if (dict_entry >= 0 && dict_entry < ENTRY_NOT_FOUND && current_index > 0)
 return dict_entry;
 return ENTRY_NOT_FOUND;
}

For your vertex array, you really should create a struct that contains the members you're going to use rather than looking them up in a 1D array of floats. It's error prone, hard-to-read, and hard-to-maintain the way you're doing it now. I'd create a struct like this:

typedef struct Vertex {
 Point3D vertex_coord; // This might be the same as LVecBase3f?
 Point2D texture_coord; // This might be the same as LVecBase2f?
} Vertex;

Where Point3D is just:

typedef struct Point3D {
 float x;
 float y;
 float z;
} Point3D;

and Point2D just has an x and y float values. Doing this will allow you to remove the magical pool_index = ((int)current_index) * 5;, and have names for the values you're writing.

Also, you don't need to use the inline directive. The compiler can decide to ignore it, and it can decide to make something you didn't mark as inline inline. So there's not a lot of reason to use it.

You can reduce the verbosity of your code, and possibly reduce the number of pointer dereferences the compiler produces by getting the address of the vertex you want and assigning all its members through that one pointer.

Given the above, your function would now look like this:

VERTICE_INDEX GetOrCreateVertice(const float x, const float y, 
 ChunkGenerator::chunkConfig *cfg) 
{
 VERTICE_INDEX current_index = 0;
 VERTICE_INDEX dict_entry = FindVerticeInDict(x, y, *cfg, current_index);
 if (dict_entry != ENTRY_NOT_FOUND)
 return dict_entry;
 LVecBase3f* offset = cfg->base_position;
 LVecBase2f* dim = cfg->dimensions;
 LVecBase2f* tex_offset = cfg->texture_offset;
 LVecBase2f* tex_scale = cfg->texture_scale;
 float base_scale = 1.0 / (cfg->subdivisions_f-1.0);
 float x_scaled = x * base_scale;
 float y_scaled = y * base_scale;
 Vertex* vertex = &cfg->vertice_pool.vertice_arry [ current_index ];
 vertex->vertex_coord.x = offset->get_x() + (x_scaled * dim->get_x());
 vertex->vertex_coord.y = offset->get_y() + (y_scaled * dim->get_y());
 vertex->vertex_coord.z = offset->get_z();
 vertex->texture_coord.x = tex_offset->get_x() + (x_scaled * tex_scale->get_x());
 vertex->texture_coord.y = tex_offset->get_y() + (y_scaled * tex_scale->get_y());
 cfg->vertice_pool.vertice_dict[dict_index] = current_index;
 cfg->vertice_pool.current_vertice_index++;
 return current_index;
}

Stack Exchange Network

Conditionally creating a new vertex

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Conditionally creating a new vertex

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions