1
\$\begingroup\$

I'm trying to write something like an adjusted run-length encoding for row-wise bit matrices. Specifically I want to "collapse" runs of 1s into the number of 1s in that run while maintaining the same number of 0s as original (right-padding as necessary). For example:

input_v = np.array([
 [0, 1, 0, 0, 1, 1, 0, 1],
 [0, 0, 1, 0, 1, 1, 1, 0]
])
expected_v = np.array([
 [0, 1, 0, 0, 2, 0, 1],
 [0, 0, 1, 0, 3, 0, 0]
])

My current attempt works after padding, but is slow:

def count_neighboring_ones(l):
 o = 0
 for i in l:
 if i == 0:
 return o
 o += 1
def f(l):
 out = []
 i = 0
 while i < len(l):
 c = count_neighboring_ones(l[i:])
 out.append(c)
 i += (c or 1)
 return out

Are there some vectorization techniques I can use to operate on the entire matrix to reduce row-wise operations and post-padding?

asked Jul 27, 2021 at 16:30
\$\endgroup\$
7
  • \$\begingroup\$ Why are you doing this? What are the dimensions of typical data you hope to compress in this way? Can you not just use Numpy's built-in sparse matrix support? \$\endgroup\$ Commented Jul 27, 2021 at 18:54
  • \$\begingroup\$ I'm doing this because I need to pass the result to library code I don't control. It's normally small dims but it happens many times. \$\endgroup\$ Commented Jul 27, 2021 at 19:13
  • \$\begingroup\$ What is the library you're passing to? \$\endgroup\$ Commented Jul 27, 2021 at 19:14
  • \$\begingroup\$ I don't think it matters. \$\endgroup\$ Commented Jul 27, 2021 at 19:15
  • \$\begingroup\$ I think it does, if you want a review. But hey, it's your life \$\endgroup\$ Commented Jul 27, 2021 at 19:17

1 Answer 1

1
\$\begingroup\$

algorithm

We asked you nicely.

What are the dimensions of typical data you hope to compress in this way?

You declined to respond.

Compiled C code and interpreted python bytecode run at different speeds. When we use Big-Oh notation, we're hiding a constant factor. Some problems manipulate a dozen integers, and some a million integers. The details matter.

 while i < len(l):
 c = count_neighboring_ones(l[i:])

Let \$n\$ be the length of the input list. The while loop has \$O(n)\$ linear time complexity, and then the l[i:] slice is also linear. So we have \$O(n^2)\$ quadratic complexity. We are told that counting ones is important because there is some sparsity to the input data, so if \$z\$ is number of zeros we have no reason to believe that \$z \ll n\$.

I don't know how big your input list tends to be, because you adamantly refused to tell us. But I can tell you this: "quadratic bad".

Rather than passing in a freshly copied vector, you want to pass in an index that points within same old vector.

lint

Pep-8 asked you nicely to avoid certain names. zomg you went for both ell and oh. If this source code was longer I might be able to fill in more of my bingo squares. In the count helper you apparently intended this sort of thing: ones += 1.

You don't have to give your functions fancier names that f. But if you go the single-letter route, then cite your reference so we can read about what some author was thinking when defining the function, or give us a """docstring""" explaining what its single responsibility is.

answered Apr 28, 2024 at 23:42
\$\endgroup\$

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.