I'll include a solution in Python and C++ and you can review one. I'm mostly interested in reviewing the C++ code which is a thing I recently started learning; those who don't know C++ can review the Python code. Both solutions share similar logic, so the review will apply to any.
Problem statement
Given an m x n matrix, return all elements of the matrix in spiral order.
Example1:
Input: matrix = [[1,2,3],[4,5,6],[7,8,9]]
Output: [1,2,3,6,9,8,7,4,5]
Example2:
Input: matrix = [[1,2,3,4],[5,6,7,8],[9,10,11,12]]
Output: [1,2,3,4,8,12,11,10,9,5,6,7]
For some reason c++ execution time is 3x python's time 10 seconds and 28 seconds for python and c++ respectively, same as my recent post I'm expecting c++ code to be a lot faster, why is it taking so long with both implementations being almost similar as both use the same recursive algorithm.
spiral_matrix.py
from time import perf_counter
def calculate(matrix):
numbers = []
if not matrix or not matrix[0]:
return numbers
elif isinstance(matrix[0], int):
return numbers + matrix
else:
if len(matrix) == 1:
return numbers + matrix[0]
if len(matrix[0]) == 1:
return numbers + [item.pop() for item in matrix]
numbers += matrix[0][:-1]
numbers += (cols := [*zip(*matrix)])[-1][:-1]
numbers += matrix[-1][::-1][:-1]
numbers += cols[0][::-1][:-1]
if rest := matrix[1:-1]:
return numbers + calculate([item[1:-1] for item in rest])
return numbers
def time_spiral(n, matrix):
print(f'Calculating time for {n} runs ...')
t1 = perf_counter()
for _ in range(n):
calculate(matrix)
print(f'Time: {perf_counter() - t1} seconds')
if __name__ == '__main__':
mtx = [
[1, 2, 3, 4, 5, 6],
[7, 8, 9, 10, 11, 12],
[13, 14, 15, 16, 17, 18],
[19, 20, 21, 22, 23, 24],
[25, 26, 27, 28, 29, 30],
[31, 32, 33, 34, 35, 36],
]
print(calculate(mtx))
time_spiral(1000000, mtx)
Results:
[1, 2, 3, 4, 5, 6, 12, 18, 24, 30, 36, 35, 34, 33, 32, 31, 25, 19, 13, 7, 8, 9, 10, 11, 17, 23, 29, 28, 27, 26, 20, 14, 15, 16, 22, 21]
Calculating time for 1000000 runs ...
Time: 9.450265422000001 seconds
spiral_matrix.h
#ifndef LEETCODE_SPIRAL_MATRIX_H
#define LEETCODE_SPIRAL_MATRIX_H
#include <vector>
std::vector<int> get_spiral(const std::vector<std::vector<int>> &matrix);
#endif //LEETCODE_SPIRAL_MATRIX_H
spiral_matrix.cpp
#include "spiral_matrix.h"
#include <chrono>
#include <iostream>
static void
add_borders(size_t start_width, size_t start_height, std::vector<int> &numbers,
const std::vector<std::vector<int>> &matrix) {
for (size_t i{0}; i < start_width - 1; ++i) {
numbers.push_back(matrix[0][i]);
}
for (size_t i{0}; i < start_height - 1; ++i) {
numbers.push_back(matrix[i][start_width - 1]);
}
for (size_t i{start_width - 1}; i > 0; --i) {
numbers.push_back(matrix[start_height - 1][i]);
}
for (size_t i{start_height - 1}; i > 0; --i) {
numbers.push_back(matrix[i][0]);
}
}
static std::vector<std::vector<int>> get_rest(size_t start_width, size_t start_height,
const std::vector<std::vector<int>> &matrix) {
std::vector<std::vector<int>> rest;
for (size_t i{1}; i < start_height - 1; ++i) {
std::vector<int> row;
for (size_t j{1}; j < start_width - 1; ++j) {
row.push_back(matrix[i][j]);
}
rest.push_back(row);
}
return rest;
}
std::vector<int> get_spiral(const std::vector<std::vector<int>> &matrix) {
std::vector<int> numbers;
if (matrix.empty() || matrix[0].empty())
return numbers;
if (matrix.size() == 1)
return matrix[0];
if (matrix[0].size() == 1) {
for (auto i: matrix)
numbers.push_back(i[0]);
return numbers;
}
auto start_width = matrix[0].size();
auto start_height = matrix.size();
add_borders(start_width, start_height, numbers, matrix);
auto rest = get_rest(start_width, start_height, matrix);
auto next_result = get_spiral(rest);
numbers.insert(numbers.end(), next_result.begin(), next_result.end());
return numbers;
}
int main() {
std::vector<std::vector<int>> matrix{{1, 2, 3, 4, 5, 6},
{7, 8, 9, 10, 11, 12},
{13, 14, 15, 16, 17, 18},
{19, 20, 21, 22, 23, 24},
{25, 26, 27, 28, 29, 30},
{31, 32, 33, 34, 35, 36}};
auto result = get_spiral(matrix);
for (int num: result)
std::cout << num << ' ';
size_t n_times{1000000};
std::cout << "\nCalculating time for " << n_times << " runs ..." << '\n';
auto t1 = std::chrono::high_resolution_clock::now();
while (n_times > 0) {
get_spiral(matrix);
n_times--;
}
auto t2 = std::chrono::high_resolution_clock::now();
auto duration = std::chrono::duration_cast<std::chrono::seconds>(
t2 - t1).count();
std::cout << duration << " seconds";
}
Results:
1 2 3 4 5 6 12 18 24 30 36 35 34 33 32 31 25 19 13 7 8 9 10 11 17 23 29 28 27 26 20 14 15 16 22 21
Calculating time for 1000000 runs ...
28 seconds
1 Answer 1
Just briefly:
The C++ version explicitly creates many new matrix objects and vectors. The python version is probably much more efficient behind the scenes.
To improve the performance of the C++ version (without changing the algorithm itself) we could:
Create a single
vector<int>
for the results, and pass it by reference toget_spiral
. We can add to it withpush_back
orinsert
.Instead of copying a subsection of the matrix, we could adjust and pass a simple bounds struct to the next recursive call:
struct bounds { std::size_t x, y, w, h; };
g++ spiral_matrix.cpp --std c++2a -o spiral_matrix
I then run the executable file and get the same results. How can I check what you're asking for? \$\endgroup\$g++ -O3 -ffast-math spiral_matrix.cpp --std c++2a -o spiral_matrix
for compiling \$\endgroup\$