k th smallest element in row wise and column wise sorted matrix

Question 1

Find the kth smallest element in row wise and column wise sorted element:

Example:

matrix[row][col] < matrix[row+1][col]

matrix[row][col] < matrix[row][col+1]

mat=

1 2 21

11 23 25

31 36 41

I have written following code, How to make more readable and improve its complexity:

import heapq
def kth_smallest(k, mat):
 m = len(mat)
 n = len(mat[0])
 hp = []
 for i in xrange(n):
 heapq.heappush(hp,[mat[0][i], 0, i])
 heapq.heapify(hp)
 while k >1:
 x = heapq.heappop(hp)
 if x[1]+1 < m:
 heapq.heappush(hp,[mat[x[1]+1][x[2]], x[1]+1, x[2]]) 
 heapq.heapify(hp)
 k-=1
 return hp[0][0]
print kth_smallest(8, [[1,10,12],[2,11,21], [3,31,45]])

Question 2

In the problem statement duplicates are not mentioned. What is the 3rd smallest element in [[0, 1], [1, 2]], 1 or 2? All our functions would say 1 (yours included).

Question 3

heappush and heappop maintain the heap invariant; there is no need to call heapify after them. Alternatively, you could initialize the heap from an unsorted list using heapify and skip the heappush, but as the list is already sorted in this case, it is already a heap and you don't need to apply either function.
You push the whole first row to the heap but the first k elements would suffice thanks to the rows being already sorted.

Instead of

for i in xrange(n):
 heapq.heappush(hp,[mat[0][i], 0, i])

it would be more idiomatic to use

for i, value in enumerate(mat[0]):
 heapq.heappush(hp, [value, 0, i])

though as I already advised to take only k elements, you could instead use

for i, value in zip(xrange(k), mat[0]):

More idiomatic instead of while k >1: ... k-=1 would be for _ in xrange(k - 1):
Readability of [mat[x[1]+1][x[2]], x[1]+1, x[2]]

could be improved to [mat[row + 1][col], row + 1, col]

by unpacking value, row, col = heapq.heappop(hp)

Revised code:

def kth_smallest(k, mat):
 '''Return the kth smallest element in row wise and column wise sorted matrix.
 mat represents the matrix as list of lists.
 '''
 m = len(mat)
 hp = [(value, 0, col) for col, value in zip(xrange(k), mat[0])]
 # hp is a min heap because mat[0] is assumed to be sorted
 for _ in xrange(k - 1):
 value, row, col = heapq.heappop(hp)
 row += 1
 if row < m:
 heapq.heappush(hp, (mat[row][col], row, col)) 
 return hp[0][0]

Question 4

I added your code to the timing table in my answer. Obviously your solution scales a lot better.

Question 5

A very easy way is to hope that list additions and the standard Python sort are fast (the latter being especially fast for partially sorted arrays).

In that case you could have used the second argument of the sum function as a starting point for the sum and used:

def kth_smallest(k, mat):
 entries = sorted(sum(mat, []))
 return entries[k - 1]

On my machine, for the example given, this outperforms your solution for small matrices, but breaks down for larger matrices. It is, however a lot easier to understand.

+----+---------------------------------------+----------+---------+--------------+
| k | matrix | Graipher | Harsha | Janne Karila |
+----+---------------------------------------+----------+---------+--------------+
| 1 | [[1,10,12],[2,11,21], [3,31,45]] | 1.34 μs | 1.61 μs | 1.07 μs |
| 8 | [[1,10,12],[2,11,21], [3,31,45]] | 1.22 μs | 5.94 μs | 3.72 μs |
| 8 | [[1,10,12, 25, 38, 42, 51], | 2.43 μs | 12 μs | 5.55 μs |
| | [2,11,21, 35, 48, 52, 67], | | | |
| | [3,31,45, 47, 58, 63, 72], | | | |
| | [4, 32, 46, 48, 59, 64, 73]] | | | |
| 25 | [range(i, i+50) for i in range(50)] | 286 μs | 220 μs | 42.4 μs |
| 25 | [range(i, i+100) for i in range(100)] | 1.8 ms | 455 μs | 79.3 μs |
+----+---------------------------------------+----------+---------+--------------+

Style-wise, there are only a few PEP8 violations, all missing whitespace around operators.

Question 6

Don't you think itertools.chain.from_iterable(mat) would be more idiomatic than sum(mat, [])?

Question 7

@MathiasEttinger You are probably right, will try it out tomorrow. But, I always wanted to find a use for the second argument of sum^^

Question 8

@MathiasEttinger Performance-wise it is basically the same, with sum being slightly faster (by about 0.2 µs).

Question 9

This will scale badly, try a bigger array

Question 10

This problem is already solved in the heapq module. If you read more carefully its documentation you'll find heapq.merge whose description boils down to "it's similar to sorted(itertools.chain(*iterables)) [pretty much as in @Graipher answer] but more efficient".

The last thing left is to extract out the \$k^{th}\$ value out of the returned generator. You can do it manually by using next k times or using itertools.islice to do it for you:

from heapq import merge
from itertools import islice
def kth_smallest(k, mat):
 sorted = merge(*mat)
 sorted_from_k = islice(sorted, k, None)
 try:
 return next(sorted_from_k)
 except StopIteration:
 raise IndexError(k)

In case of very large number of rows in the original matrix, you may want to limit their unpacking. You can try slicing:

def kth_smallest(k, mat):
 sorted = merge(*mat[:k])
 sorted_from_k = islice(sorted, k, None)
 try:
 return next(sorted_from_k)
 except StopIteration:
 raise IndexError(k)

Janne Karila Janne Karila 10.6k21 silver badges34 bronze badges · Answer 1 · 2017-04-19 06:09:57Z

heappush and heappop maintain the heap invariant; there is no need to call heapify after them. Alternatively, you could initialize the heap from an unsorted list using heapify and skip the heappush, but as the list is already sorted in this case, it is already a heap and you don't need to apply either function.
You push the whole first row to the heap but the first k elements would suffice thanks to the rows being already sorted.

Instead of

for i in xrange(n):
 heapq.heappush(hp,[mat[0][i], 0, i])

it would be more idiomatic to use

for i, value in enumerate(mat[0]):
 heapq.heappush(hp, [value, 0, i])

though as I already advised to take only k elements, you could instead use

for i, value in zip(xrange(k), mat[0]):

More idiomatic instead of while k >1: ... k-=1 would be for _ in xrange(k - 1):
Readability of [mat[x[1]+1][x[2]], x[1]+1, x[2]]

could be improved to [mat[row + 1][col], row + 1, col]

by unpacking value, row, col = heapq.heappop(hp)

Revised code:

def kth_smallest(k, mat):
 '''Return the kth smallest element in row wise and column wise sorted matrix.
 mat represents the matrix as list of lists.
 '''
 m = len(mat)
 hp = [(value, 0, col) for col, value in zip(xrange(k), mat[0])]
 # hp is a min heap because mat[0] is assumed to be sorted
 for _ in xrange(k - 1):
 value, row, col = heapq.heappop(hp)
 row += 1
 if row < m:
 heapq.heappush(hp, (mat[row][col], row, col)) 
 return hp[0][0]

I added your code to the timing table in my answer. Obviously your solution scales a lot better.

Graipher Graipher 41.6k7 gold badges70 silver badges134 bronze badges · Answer 2 · 2017-04-18 13:08:25Z

A very easy way is to hope that list additions and the standard Python sort are fast (the latter being especially fast for partially sorted arrays).

In that case you could have used the second argument of the sum function as a starting point for the sum and used:

def kth_smallest(k, mat):
 entries = sorted(sum(mat, []))
 return entries[k - 1]

On my machine, for the example given, this outperforms your solution for small matrices, but breaks down for larger matrices. It is, however a lot easier to understand.

+----+---------------------------------------+----------+---------+--------------+
| k | matrix | Graipher | Harsha | Janne Karila |
+----+---------------------------------------+----------+---------+--------------+
| 1 | [[1,10,12],[2,11,21], [3,31,45]] | 1.34 μs | 1.61 μs | 1.07 μs |
| 8 | [[1,10,12],[2,11,21], [3,31,45]] | 1.22 μs | 5.94 μs | 3.72 μs |
| 8 | [[1,10,12, 25, 38, 42, 51], | 2.43 μs | 12 μs | 5.55 μs |
| | [2,11,21, 35, 48, 52, 67], | | | |
| | [3,31,45, 47, 58, 63, 72], | | | |
| | [4, 32, 46, 48, 59, 64, 73]] | | | |
| 25 | [range(i, i+50) for i in range(50)] | 286 μs | 220 μs | 42.4 μs |
| 25 | [range(i, i+100) for i in range(100)] | 1.8 ms | 455 μs | 79.3 μs |
+----+---------------------------------------+----------+---------+--------------+

Style-wise, there are only a few PEP8 violations, all missing whitespace around operators.

Don't you think itertools.chain.from_iterable(mat) would be more idiomatic than sum(mat, [])?
@MathiasEttinger You are probably right, will try it out tomorrow. But, I always wanted to find a use for the second argument of sum^^
@MathiasEttinger Performance-wise it is basically the same, with sum being slightly faster (by about 0.2 µs).

score 1 · Answer 3 · 2017-04-19 19:24:25Z

This problem is already solved in the heapq module. If you read more carefully its documentation you'll find heapq.merge whose description boils down to "it's similar to sorted(itertools.chain(*iterables)) [pretty much as in @Graipher answer] but more efficient".

The last thing left is to extract out the \$k^{th}\$ value out of the returned generator. You can do it manually by using next k times or using itertools.islice to do it for you:

from heapq import merge
from itertools import islice
def kth_smallest(k, mat):
 sorted = merge(*mat)
 sorted_from_k = islice(sorted, k, None)
 try:
 return next(sorted_from_k)
 except StopIteration:
 raise IndexError(k)

In case of very large number of rows in the original matrix, you may want to limit their unpacking. You can try slicing:

def kth_smallest(k, mat):
 sorted = merge(*mat[:k])
 sorted_from_k = islice(sorted, k, None)
 try:
 return next(sorted_from_k)
 except StopIteration:
 raise IndexError(k)

Stack Exchange Network

k th smallest element in row wise and column wise sorted matrix

3 Answers 3

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

k th smallest element in row wise and column wise sorted matrix

3 Answers 3

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions