I would like to use C to implement the following formula:
$$\mathbf S(u,v)=\sum_{r=i-p}^{i}\sum_{s=j-q}^{j}N_{r,p}(u)N_{s,q}(v)\mathbf P_{r,s}$$
Namely,
$$ \left(N_{i-p,p}(u),\cdots,N_{i,p}(u)\right)_{1\times(p+1)} \begin{pmatrix} \mathbf{P}_{i-p,j-q} & \cdots & \mathbf{P}_{i-p,j}\\ \vdots& \cdots &\vdots\\ \mathbf{P}_{i,j-q} & \cdots & \mathbf{P}_{i,j}\\ \end{pmatrix} \begin{pmatrix} N_{j-q,q}(v)\\ \vdots\\ N_{j-q,q}(v)\\ \end{pmatrix}_{(q+1)\times 1} $$
where,
$$\mathbf P_{i,j}=\{x_{i,j},y_{i,j},z_{i,j}\}$$ or
$$\mathbf P_{i,j}=\{x_{i,j},y_{i,j},z_{i,j}, w_{i,j}\}$$
So the middle matrix is actually a tensor with rank 3, like this:
{{{1,1,0.09986},{1,2,0.07914},{1,3,0.80686},{1,4,0.68846},{1,5,0.297402}},
{{2,1,0.12262},{2,2,0.41283},{2,3,0.38565},{2,4,0.05670},{2,5,-0.12276}},
{{3,1,0.08301},{3,2,0.13181},{3,3,0.36565},{3,4,0.74432},{3,5,-0.62065}},
{{4,1,0.12755},{4,2,0.06099},{4,3,0.74465},{4,4,0.18402},{4,5,0.336987}},
{{5,1,0.12346},{5,2,0.97057},{5,3,0.72663},{5,4,0.23481},{5,5,0.968757}}}
Here is my algorithm and implementation with C code:
C-like algorithm
void calc_bspline_surf(double *P, double *U, double *V, int p, int q, double u, double v, int const *dims, double *S) {
double *Nu = (double *) malloc((p + 1)*sizeof(double));
double *Nv = (double *) malloc((q + 1)*sizeof(double));
double *temp = (double *) malloc(dims[2]*sizeof(double));
int m, n, c;
int i, j;
int k, l, r;
int uidx, vidx, idx;
m = dims[0];
n = dims[1];
c = dims[2];
i = find_span_index(p, U, m - 1, u);
j = find_span_index(q, V, n - 1, v);
b_spline_basis(p, U, i, u, Nu);
b_spline_basis(q, V, j, v, Nv);
/*main implementation*/
uidx = i - p;
for (l = 0; l <= q; l++){
for (r = 0; r <= c - 1; r++){
temp[r] = 0.0;
}
vidx = j - q + l;
for (k = 0; k <= p; k++){
idx = c*n*(uidx + k) + c*vidx;
for (r = 0; r <= c - 1; r++){
temp[r] = temp[r] + Nu[k]*P[idx+r];
}
}
for (r = 0; r <= c - 1; r++){
S[r] = S[r] + Nv[l]*temp[r];
}
}
free(Nu);
free(Nv);
free(temp);
}
Notes:
- For a tensor
T
with dimensions{m, n, c}
, which was stored with a flattened vectorV
, we can refer toT[i][j][k]
withV[n*c*i + n*j + k - 1]
- In function
calc_bspline_surf()
, theNu
, andNv
stores the values of vectorsN(i-p,p),..., N(i,p)
andN(j-q,q),..., N(j,q)
, respectively. S[0],...,S[c-1]
store the results
The complete file could be download with this link.
Owing to I am a C newcomer, I think my code is not fast. I'm seeking a more efficient strategy for dealing with this problem.
-
\$\begingroup\$ Do you have a name for that matrix formula? Or some better way to characterize it other than "a matrix formula"? \$\endgroup\$syb0rg– syb0rg2016年07月12日 12:08:49 +00:00Commented Jul 12, 2016 at 12:08
-
\$\begingroup\$ @syb0rg Maybe call it tensor product? \$\endgroup\$xyz– xyz2016年07月12日 13:18:14 +00:00Commented Jul 12, 2016 at 13:18
-
\$\begingroup\$ What aspect of this code do you want reviewed? Or is it just a general review request? \$\endgroup\$chux– chux2016年07月14日 05:20:05 +00:00Commented Jul 14, 2016 at 5:20
-
1\$\begingroup\$ @chux In fact, I would like to know that : is there more efficient method to implement that tensor product? \$\endgroup\$xyz– xyz2016年07月14日 06:34:23 +00:00Commented Jul 14, 2016 at 6:34
-
2\$\begingroup\$ "efficient method" could be taken as execution performance (speed) - which is what I think you want. "efficient method" could also mean minimal data space, minimal source code, etc. \$\endgroup\$chux– chux2016年07月15日 19:41:32 +00:00Commented Jul 15, 2016 at 19:41
2 Answers 2
Readability & Maintainability
Right now your code is a bit hard to read. A way to greatly help this would be to use better variable names.
Put the variable declarations to separate lines and initialize them to some value at the same time. From Code Complete, 2d Edition, p. 759:
With statements on their own lines, the code reads from top to bottom, instead of top to bottom and left to right. When you’re looking for a specific line of code, your eye should be able to follow the left margin of the code. It shouldn’t have to dip into each and every line just because a single line might contain two statements.
You should declare your
for
loops as such:for(int i = 0; i < size; ++i)
Note that this was introduced in the C99 standard. There is no reason you should not be using this standard in your code.
Design
Group your variables in
struct
s to make them more maintainable, a good hint that you should be doing this is when you are passing in 9 parameters to your function.Return the output vector; it makes your function easier to use. You'll have to allocate the space in your method and free it elsewhere. This also eliminates the potential bug discussed below.
Right now you are using this loop to zero out
temp
(or at least part of it)for (r = 0; r <= c - 1; r++){ temp[r] = 0.0; }
The original algorithm doesn't implement this in a separate loop, and I don't recommend you do either. It's a loop in the code you don't need.
These lines:
temp[r] = temp[r] + Nu[k]*P[idx+r]; S[r] = S[r] + Nv[l]*temp[r];
Can be shortened to:
temp[r] += Nu[k]*P[idx+r]; S[r] += Nv[l]*temp[r];
Set those pointers you
free()
ed toNULL
right after. This prevents us from trying to access the data afterwards, which will result in undefined and usually catastrophic behavior.
Bug
Right now you are relying on the passed in values already stored in
S[r]
when you addNv[l]*temp[r]
to them. In the original NURBS A3.5 algorithm, they are 0'ed out right before their use. You could overwrite all these values to 0 with amemset()
in our function and continue as is. However, overwriting something the user passes when we could've started from scratch is usually not expected behavior.Create the array at the top of the function like you do with the other arrays, using
calloc()
.
-
\$\begingroup\$ Thanks a lot:) I will refactor it according to your suggestions. About the
S[r]
, I have initialized it to 0 with for-loop in previous function. \$\endgroup\$xyz– xyz2016年07月15日 15:23:25 +00:00Commented Jul 15, 2016 at 15:23 -
1\$\begingroup\$ @ShutaoTANG I know you did, I looked at your full implementation. It is still a bug IMO, and is very likely to result in unexpected results. \$\endgroup\$syb0rg– syb0rg2016年07月15日 15:25:21 +00:00Commented Jul 15, 2016 at 15:25
-
\$\begingroup\$ @ShutaoTANG Don't forget to up-vote and (if you deem) accept! ;) \$\endgroup\$syb0rg– syb0rg2016年07月15日 15:30:37 +00:00Commented Jul 15, 2016 at 15:30
-
\$\begingroup\$ Using
calloc()
is practical for assigning0.0
. C does not require a bit pattern of zero interpreted as a FP to be 0.0 - so pedantically,for (r = 0; r <= c - 1; r++){ temp[r] = 0.0; }
is more portable. \$\endgroup\$chux– chux2016年07月15日 19:38:54 +00:00Commented Jul 15, 2016 at 19:38 -
1\$\begingroup\$ @chux According to the original algorithm that the OP is using, he shouldn't have a separate loop for it. I misunderstood his intentions initially, I'll make an edit quick. \$\endgroup\$syb0rg– syb0rg2016年07月15日 19:50:03 +00:00Commented Jul 15, 2016 at 19:50
To improve execution performance, focus on the inner loop
void calc_bspline_surf(double *P, double *U, double *V,
int p, int q, double u, double v, int const *dims, double *S) {
...
vidx = j - q + l;
for (k = 0; k <= p; k++){
idx = c*n*(uidx + k) + c*vidx;
for (r = 0; r <= c - 1; r++){
temp[r] = temp[r] + Nu[k]*P[idx+r];
}
}
Help the compiler use its optimizations. By using const
, the compiler can know referenced data should/not will not be changed and optimize accordingly.
void calc_bspline_surf(const double *P, const double *U, const double *V,
int p, int q, double u, double v, int const *dims, double *S) {
By using restrict
, which effectively says the referenced data will not change by using the other pointers, again, more optimizations can occur.
void calc_bspline_surf(
const double * restrict P, const double * restrict U, const double * restrict V,
int p, int q, double u, double v,
int const *restrict dims, double *restrict S) {
For a minor help, the inner loop may benefit a bit with re-order code. Profile different ways.
// Choose 1:
// simplify end-of-loop test
for (r = c-1; r >= 0; r--){
// or
for (r = c; r-- > 0; ){
// or
for (r = 0; r < c; r++){
// or original
for (r = 0; r <= c - 1; r++){
// Choose 1:
// This _may_ make for better code.
temp[r] += Nu[k]*P[idx+r];
// A good compile will generate the same code with this source
temp[r] = temp[r] + Nu[k]*P[idx+r];
}
-
\$\begingroup\$ Hm, I haven't really seen much use of
restrict
in the wild. Good thing to note for future reviews, +1! \$\endgroup\$syb0rg– syb0rg2016年07月15日 21:47:55 +00:00Commented Jul 15, 2016 at 21:47