I wrote a function to normalize sparse matrix along rows using C++ Eigen; however, I feel that it can be improved, and that I am not using some built in functions of Eigen.
#include <iostream>
#include <Eigen/Dense>
#include <Eigen/Sparse>
#include <vector>
#include <math.h>
using namespace Eigen;
std::vector<double> get_weights(SparseMatrix<double, RowMajor> sm);
SparseMatrix<double, RowMajor> row_normalize(SparseMatrix<double, RowMajor> sm);
int main()
{
MatrixXd silly(6, 3);
silly << 0, 1, 36,
0, 3, 0,
0, 0, 0,
3, 2, 1,
0, 1, 0,
2, 0, 9;
SparseMatrix<double, RowMajor> sparse_silly = silly.sparseView();
SparseMatrix<double, RowMajor> normalized = row_normalize(sparse_silly);
std::cout << "normalized" << std::endl;
std::cout << normalized << std::endl;
SparseMatrix<double, RowMajor> test = normalized * normalized.transpose();
// Checking that all diagonal elements of non-zero rows are 1
std::cout << "test" << std::endl;
std::cout << test << std::endl;
return 0;
}
SparseMatrix<double, RowMajor> row_normalize(SparseMatrix<double, RowMajor> sm)
{
std::vector<double> weights = get_weights(sm);
SparseMatrix<double, RowMajor> sm_weights(sm.rows(), sm.rows());
for(int i = 0; i < weights.size(); ++i)
{
sm_weights.insert(i,i) = weights[i];
}
return sm_weights * sm;
}
std::vector<double> get_weights(SparseMatrix<double, RowMajor> sm)
{
std::vector<double> weights;
for (int i = 0; i < sm.rows(); ++i)
{
double my_sum = 0;
double *val_ptr = sm.row(i).valuePtr();
if(sm.row(i).nonZeros())
// avoid division by zero problem
{
for (int j = 0; j < sm.row(i).nonZeros(); ++j)
{
my_sum += (*val_ptr * (*val_ptr));
val_ptr++;
}
weights.push_back(1.0 / sqrt(my_sum));
}
else
{
weights.push_back(0);
}
}
return weights;
}
1 Answer 1
I'd put the
Eigen
headers below the STL headers. Also,<math.h>
should be<cmath>
, the latter being the C++ library incorporated into thestd
namespace.You can define
main()
below the other functions, allowing you to save a bit of room by eliminating the function prototypes.You don't need that flushing with
std::endl
. To get just a newline, output a"\n"
.main()
doesn't need an explicitreturn 0
. As reaching this point implies success, the compiler will do this return for you.You shouldn't use an
int
here:for(int i = 0; i < weights.size(); ++i) { sm_weights.insert(i,i) = weights[i]; }
If you had your compiler warnings up high (and they should be), you should've been told about a type-mismatch here. The type for
i
should instead bestd::size_type
(an unsigned type), which is the return type ofsize()
for each STL container class.For this
for
loop:for (int j = 0; j < sm.row(i).nonZeros(); ++j) { my_sum += (*val_ptr * (*val_ptr)); val_ptr++; }
val_ptr++
can still be put into the loop statement:for (int j = 0; j < sm.row(i).nonZeros(); ++j, val_ptr++) { my_sum += (*val_ptr * (*val_ptr)); }
I'm also unsure about all the raw pointer use here. Raw pointers in C++ should be kept to a minimum, so there may be a way to do this without using any of them.
You pass
sm
by value in both of your functions, however, it doesn't look like you're trying to allow the compiler to perform move semantics. But you're also using the argument in such a way that it's only being read for writing to other values. In such a case, you should passsm
byconst&
.