1
\$\begingroup\$

Link here

I'll include a solution in Python and C++ and you can review one. I'm mostly interested in reviewing the C++ code which is a thing I recently started learning; those who don't know C++ can review the Python code. Both solutions share similar logic, so the review will apply to any.


Problem statement

Given an m x n matrix, return all elements of the matrix in spiral order.

Example1:

spiral1

Input: matrix = [[1,2,3],[4,5,6],[7,8,9]]
Output: [1,2,3,6,9,8,7,4,5]

Example2:

spiral2

Input: matrix = [[1,2,3,4],[5,6,7,8],[9,10,11,12]]
Output: [1,2,3,4,8,12,11,10,9,5,6,7]

For some reason c++ execution time is 3x python's time 10 seconds and 28 seconds for python and c++ respectively, same as my recent post I'm expecting c++ code to be a lot faster, why is it taking so long with both implementations being almost similar as both use the same recursive algorithm.

spiral_matrix.py

from time import perf_counter
def calculate(matrix):
 numbers = []
 if not matrix or not matrix[0]:
 return numbers
 elif isinstance(matrix[0], int):
 return numbers + matrix
 else:
 if len(matrix) == 1:
 return numbers + matrix[0]
 if len(matrix[0]) == 1:
 return numbers + [item.pop() for item in matrix]
 numbers += matrix[0][:-1]
 numbers += (cols := [*zip(*matrix)])[-1][:-1]
 numbers += matrix[-1][::-1][:-1]
 numbers += cols[0][::-1][:-1]
 if rest := matrix[1:-1]:
 return numbers + calculate([item[1:-1] for item in rest])
 return numbers
def time_spiral(n, matrix):
 print(f'Calculating time for {n} runs ...')
 t1 = perf_counter()
 for _ in range(n):
 calculate(matrix)
 print(f'Time: {perf_counter() - t1} seconds')
if __name__ == '__main__':
 mtx = [
 [1, 2, 3, 4, 5, 6],
 [7, 8, 9, 10, 11, 12],
 [13, 14, 15, 16, 17, 18],
 [19, 20, 21, 22, 23, 24],
 [25, 26, 27, 28, 29, 30],
 [31, 32, 33, 34, 35, 36],
 ]
 print(calculate(mtx))
 time_spiral(1000000, mtx)

Results:

[1, 2, 3, 4, 5, 6, 12, 18, 24, 30, 36, 35, 34, 33, 32, 31, 25, 19, 13, 7, 8, 9, 10, 11, 17, 23, 29, 28, 27, 26, 20, 14, 15, 16, 22, 21]
Calculating time for 1000000 runs ...
Time: 9.450265422000001 seconds

spiral_matrix.h

#ifndef LEETCODE_SPIRAL_MATRIX_H
#define LEETCODE_SPIRAL_MATRIX_H
#include <vector>
std::vector<int> get_spiral(const std::vector<std::vector<int>> &matrix);
#endif //LEETCODE_SPIRAL_MATRIX_H

spiral_matrix.cpp

#include "spiral_matrix.h"
#include <chrono>
#include <iostream>
static void
add_borders(size_t start_width, size_t start_height, std::vector<int> &numbers,
 const std::vector<std::vector<int>> &matrix) {
 for (size_t i{0}; i < start_width - 1; ++i) {
 numbers.push_back(matrix[0][i]);
 }
 for (size_t i{0}; i < start_height - 1; ++i) {
 numbers.push_back(matrix[i][start_width - 1]);
 }
 for (size_t i{start_width - 1}; i > 0; --i) {
 numbers.push_back(matrix[start_height - 1][i]);
 }
 for (size_t i{start_height - 1}; i > 0; --i) {
 numbers.push_back(matrix[i][0]);
 }
}
static std::vector<std::vector<int>> get_rest(size_t start_width, size_t start_height,
const std::vector<std::vector<int>> &matrix) {
 std::vector<std::vector<int>> rest;
 for (size_t i{1}; i < start_height - 1; ++i) {
 std::vector<int> row;
 for (size_t j{1}; j < start_width - 1; ++j) {
 row.push_back(matrix[i][j]);
 }
 rest.push_back(row);
 }
 return rest;
}
std::vector<int> get_spiral(const std::vector<std::vector<int>> &matrix) {
 std::vector<int> numbers;
 if (matrix.empty() || matrix[0].empty())
 return numbers;
 if (matrix.size() == 1)
 return matrix[0];
 if (matrix[0].size() == 1) {
 for (auto i: matrix)
 numbers.push_back(i[0]);
 return numbers;
 }
 auto start_width = matrix[0].size();
 auto start_height = matrix.size();
 add_borders(start_width, start_height, numbers, matrix);
 auto rest = get_rest(start_width, start_height, matrix);
 auto next_result = get_spiral(rest);
 numbers.insert(numbers.end(), next_result.begin(), next_result.end());
 return numbers;
}
int main() {
 std::vector<std::vector<int>> matrix{{1, 2, 3, 4, 5, 6},
 {7, 8, 9, 10, 11, 12},
 {13, 14, 15, 16, 17, 18},
 {19, 20, 21, 22, 23, 24},
 {25, 26, 27, 28, 29, 30},
 {31, 32, 33, 34, 35, 36}};
 auto result = get_spiral(matrix);
 for (int num: result)
 std::cout << num << ' ';
 size_t n_times{1000000};
 std::cout << "\nCalculating time for " << n_times << " runs ..." << '\n';
 auto t1 = std::chrono::high_resolution_clock::now();
 while (n_times > 0) {
 get_spiral(matrix);
 n_times--;
 }
 auto t2 = std::chrono::high_resolution_clock::now();
 auto duration = std::chrono::duration_cast<std::chrono::seconds>(
 t2 - t1).count();
 std::cout << duration << " seconds";
}

Results:

1 2 3 4 5 6 12 18 24 30 36 35 34 33 32 31 25 19 13 7 8 9 10 11 17 23 29 28 27 26 20 14 15 16 22 21 
Calculating time for 1000000 runs ...
28 seconds
asked Nov 23, 2020 at 11:16
\$\endgroup\$
11
  • \$\begingroup\$ Which options are you using for compilation ? With -O3 and -ffast-math, I get the result in 839 ms ! I paid attention to use the intermediate output of the function, to be sure that the optimiser was not cancelling everything \$\endgroup\$ Commented Nov 23, 2020 at 17:36
  • \$\begingroup\$ @Damien I'm not sure of that because I run the code in the editor (clion) I also tried from the command line: g++ spiral_matrix.cpp --std c++2a -o spiral_matrix I then run the executable file and get the same results. How can I check what you're asking for? \$\endgroup\$ Commented Nov 23, 2020 at 17:41
  • \$\begingroup\$ Just use g++ -O3 -ffast-math spiral_matrix.cpp --std c++2a -o spiral_matrix for compiling \$\endgroup\$ Commented Nov 23, 2020 at 17:43
  • \$\begingroup\$ @Damien It's down from 24 seconds to 5 seconds, isn't this still slow? I'm running this on my i5 mbp. What are these flags? and how can I adjust editor settings accordingly? is there some kind of documentation to these things? \$\endgroup\$ Commented Nov 23, 2020 at 17:55
  • \$\begingroup\$ I have a i7. You can get details about these options at official gcc documention, for example, here. For compilation, I always use a command line, or a makefile (linux) \$\endgroup\$ Commented Nov 23, 2020 at 18:00

1 Answer 1

2
\$\begingroup\$

Just briefly:

The C++ version explicitly creates many new matrix objects and vectors. The python version is probably much more efficient behind the scenes.

To improve the performance of the C++ version (without changing the algorithm itself) we could:

  • Create a single vector<int> for the results, and pass it by reference to get_spiral. We can add to it with push_back or insert.

  • Instead of copying a subsection of the matrix, we could adjust and pass a simple bounds struct to the next recursive call:

    struct bounds { std::size_t x, y, w, h; };
    
answered Nov 23, 2020 at 16:41
\$\endgroup\$

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.