Background
I'm writing code that searches for the row number of the maximum value in a partial column of a matrix. The partial column corresponds with the non-zero values of an equivalent lower triangular matrix. That is, starting from the input row index idx
through the final row. (This is my own version of subroutine for Gaussian Elimination with Partial Pivoting, for those who are interested).
Basically, the behavior of MaxLocColumnWise
is the search of columnwise maximum location
in the submatrices of A. Passing by reference, MaxLocColumnWise
changes the value of k
.
For example, for the following matrix $$ A = \begin{bmatrix}1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix} $$
- when
k=0
,MaxLocColumnWise(A,k)
convertsk
to the index of maximum element in[1,4,7]
, which is2
, which is the index of7
i.e. we get2
as a result of
int k = 0;
`MaxLocColumnWise(A,k)`;
std::cout << k <<"\n";
when
k=1
,MaxLocColumnWise(A,k)
convertsk
to the index of maximum element in[5,8]
, which is2
, which is the index of8
. (This case is highlighted in red here)when
k=2
,MaxLocColumnWise(A,k)
convertsk
to the index of maximum element in[9]
, which is2
, which is the index of9
Code
Interestingly it turns out that the following code takes quite long time.
void MaxLocColumnWise(Matrix A, int &idx){
int n = A.size();
int col = idx;
double currentAbsMax = abs(A[col][col]);
for (int i=(col+1); i<n; i++){
double currentVal = abs(A[i][col]);
if (currentVal > currentAbsMax){
currentAbsMax = currentVal;
idx = i;
}
}
}
I have implemented this subroutine to the following Gaussian elimination with partial pivoting
routine:
void GaussElimPartialPivot(Matrix& A, Vector& b){
int n = A.size();
for (int j=0; j<(n-1); j++){
int index = j;
MaxLocColumnWise(A, index);
SwapMatrixRows(A, j, index);
SwapVector(b, j, index);
// main loop
for (int i=(j+1); i<n; i++){
double m = A[i][j]/A[j][j];
b[i] -= m*b[j];
for(int k=j; k<n; k++){
A[i][k] -= m*A[j][k];
}
}
}
}
But when A gets large, the program gets slower exactly due to the subroutine MaxLocColumnWise
, which was verified from excluding each subroutine in the main code.
But I'm not sure exactly where in MaxLocColumnWise()
to blame. Any help will be appreciated.
(An excuse for the exclusion of the code : Matrix
is just from typedef std::vector<Vector> Matrix;
, and the Vector
is typedef std::vector<double> Vector;
)
1 Answer 1
I see a number of things that may help you improve your code.
Pass by const reference where practical
The first argument to MaxLocColumnWise
is a Matrix
but that causes the entire input matrix to be duplicated. Better would be to make it const Matrix &
because it is not modified and it doesn't need to be duplicated. This is very likely the crux of your code's performance problem. On my machine with a matrix of size 1000, that single change drops the execution time down from 3.1 seconds to 11 milliseconds.
Prefer return value over reference
Instead of modifying one of the passed parameters, it's often better to return
a value instead. So in this case, the function would be
int MaxLocColumnWise(const Matrix &A, int idx);
Check parameters before use
If idx
is a negative number or beyond the end of the Matrix
, your program will invoke undefined behavior and it could crash or worse. Better would be to verify the value is in a valid range before use.
Use appropriate data types
An index into an array is never negative, so instead of int
, I'd recommend using std::size_t
.
Provide complete code to reviewers
This is not so much a change to the code as a change in how you present it to other people. Without the full context of the code and an example of how to use it, it takes more effort for other people to understand your code. This affects not only code reviews, but also maintenance of the code in the future, by you or by others. One good way to address that is by the use of comments. Another good technique is to include test code showing how your code is intended to be used. I split your code into a header Gauss.h
and implementation file Gauss.cpp
and then created a test driver. These are the resulting files, after applying all of the suggestions above:
Gauss.h
#ifndef GAUSS_H
#define GAUSS_H
#include <vector>
typedef std::vector<double> Vector;
typedef std::vector<Vector> Matrix;
std::size_t MaxLocColumnWise(const Matrix &A, std::size_t idx);
void GaussElimPartialPivot(Matrix& A, Vector& b);
#endif // GAUSS_H
Gauss.cpp
#include "Gauss.h"
#include <cmath>
std::size_t MaxLocColumnWise(const Matrix &A, std::size_t idx){
auto maxIndex{idx};
const auto col{idx};
for (double maxValue{-1}; idx < A.size(); ++idx) {
auto currentVal{std::abs(A[idx][col])};
if (currentVal > maxValue){
maxValue = currentVal;
maxIndex = idx;
}
}
return maxIndex;
}
void GaussElimPartialPivot(Matrix& A, Vector& b) {
const std::size_t n{A.size()};
for (std::size_t j{1}; j < n; ++j) {
auto maxcol{MaxLocColumnWise(A, j)};
SwapMatrixRows(A, j, maxcol);
SwapVector(b, j, maxcol);
for (std::size_t i{j-1}; i < n; ++i) {
double m{A[i][j]/A[j][j]};
b[i] -= m*b[j];
for (auto k{j}; k < n; ++k){
A[i][k] -= m*A[j][k];
}
}
}
}
main.cpp
#include "Gauss.h"
#include <iostream>
#include <numeric>
int main() {
constexpr size_t n{1000};
Matrix m;
m.reserve(n);
int startval{1};
for (size_t i{0}; i < n; ++i) {
Vector v(n);
std::iota(v.begin(), v.end(), startval);
m.push_back(v);
startval += n;
}
for (int i=0; i < n; ++i) {
auto max{MaxLocColumnWise(m, i)};
if (max != n-1) {
std::cout << "Error:" << i << ", " << MaxLocColumnWise(m, i) << '\n';
}
}
}
-
1\$\begingroup\$ Using unsigned for indexes is a bad idea. OP has
for (int j=0; j<(n-1); j++)
in which switching to unsigned would introduce a bug (considern ==0
). Also, range checks can be left tostd::vector
here. \$\endgroup\$D Drmmr– D Drmmr2019年04月27日 18:51:43 +00:00Commented Apr 27, 2019 at 18:51 -
2\$\begingroup\$ In the rewrite, no range check is required exactly because the revised code is using a
std::size_t
instead of anint
, and the bug you suggest does not exist, so I think you may want to reconsider your statement. \$\endgroup\$Edward– Edward2019年04月27日 19:08:18 +00:00Commented Apr 27, 2019 at 19:08 -
1\$\begingroup\$ Please explain "the bug does not exist". Following your advice yields
std::vector<int> v; for (size_t i = 0, n = v.size(); i < n - 1; ++i) ++v[i];
. That's a bug! \$\endgroup\$D Drmmr– D Drmmr2019年04月27日 20:38:22 +00:00Commented Apr 27, 2019 at 20:38 -
2\$\begingroup\$ The code can't be used as is with the updated version because of the changed interface, so any calling code must also be revised. Revising the code can easily be done to avoid introducing a bug. I've shown one means by which that might be done in my answer. I hadn't posted it before because we don't actually know the interface for
SwapMatrixRows
orSwapVector
. \$\endgroup\$Edward– Edward2019年04月27日 21:30:31 +00:00Commented Apr 27, 2019 at 21:30 -
3\$\begingroup\$ I'd also note that the
size
operator for all STL containers typically returnsstd::size_t
and notint
. \$\endgroup\$Edward– Edward2019年04月27日 21:32:32 +00:00Commented Apr 27, 2019 at 21:32