2
\$\begingroup\$

Problem Statement

(Source: Leetcode Problem 4: Median of Two Sorted Arrays [Hard])
(Topics: [Array] [Binary Search] [Divide and Conquer])

Given two sorted arrays nums1 and nums2 of size m and n respectively, return the median of the two sorted arrays.

The overall time complexity should be O(log(m+n)).

I was able to solve this problem with the following code:

def findMedianSortedArrays(self, nums1, nums2):
 nums = nums1 + nums2
 nums.sort()
 if len(nums) % 2 == 0:
 return (nums[(len(nums) / 2) - 1] + nums[len(nums) / 2]) / 2.0
 else:
 return nums[len(nums) // 2] 

which has a time complexity of \$O((m+n)\log(m+n))\$ and space complexity of \$O(m+n)\$.

Admittedly while I didn't find the problem too difficult overall, I do want to still try to improve the time complexity of my code to at least \$O(\log(m+n))\$.

My question is:

What am I misunderstanding about the approach needed to make the code run in \$O(\log(m+n))\$ time?

Edit

Per toolic's suggestion, I have modified the code to run on newer versions of Python (of course also using snake_case to write function names for readability). I used this online Python editor to test my fixes:

def find_median_two_sorted_arrays(nums1: list, nums2: list):
 nums = nums1 + nums2
 nums.sort
 if len(nums) % 2 == 0:
 return (nums[(len(nums) // 2) - 1] + nums[len(nums) // 2]) / 2.0
 else:
 return nums[len(nums) // 2] 

Note: I would like to say that I know (after doing some research) that this can be done in \$O(\log(\min(n,m)))\$ time complexity with \$O(1)\$ space complexity. However, this is not what I want as this is admittedly very out of my reach with my understanding of Python.

asked Dec 6, 2024 at 16:20
\$\endgroup\$
3
  • 2
    \$\begingroup\$ A O(log(min(m, n))) approach has been mentioned here: codereview.stackexchange.com/a/202358/35991. \$\endgroup\$ Commented Dec 6, 2024 at 17:07
  • 2
    \$\begingroup\$ @TobySpeight The time-limit-exceeded tag states "You may use this tag instead of performance when an online judge rejects your solution to a programming-challenge due to time constraints". We explicitly allow code which doesn't scale. \$\endgroup\$ Commented Dec 9, 2024 at 1:29
  • \$\begingroup\$ The crucial point of the task is finding a median of two arrays, not a single array obtained by joining the two. \$\endgroup\$ Commented Dec 11, 2024 at 18:37

2 Answers 2

7
\$\begingroup\$

mentioned that \$O(\log \min(n,m))\$ would be really outside of what I can understand with my current knowledge.

Nah! It's really about the sizes \$n, m\$, rather than about the algorithm. If one is of comparable magnitude to the other, then \$O(\min(n, m))\$ is simply \$O(n)\$, a case that's easily dealt with. For the \$\min\$ to be interesting, one of these must hold:

  • \$n \ll m\$, or
  • \$n \gg m\$

That's saying that one of the input lists is essentially "empty" for purposes of Big-Oh analysis, and won't significantly contribute to the overall time.

Take e.g. the \$n \ll m\$ case. Since \$n\$ is so small, we could merge its entries into the larger list with negligible cost. Or just use a \$O(\log m+n))\$ solution, confident that that's really \$O(\log m)\$.


improve the time complexity of my code to at least \$O(\log m+n))\$.

Your code uses a linear time expression, nums1 + nums2. You need to be able to talk about "the midpoint of the combined arrays" without ever touching the majority of array elements. A datastructure like this will assist with that:

class SlicedList:
 """Models a slice of a list using start and end index, with zero copies."""
 def __init__(
 self, nums: list[float], start: int = 0, end: int | None = None
 ) -> None:
 self.nums = nums
 self.start = start
 self.end = len(nums) if end is None else end
 def slice(self, start: int, end: int | None = None) -> "SlicedList":
 length = self.end - self.start
 end_i: int = length if end is None else end
 assert 0 <= start <= end_i <= length
 return SlicedList(self.nums, self.start + start, self.start + end_i)
 def __getitem__(self, i: int) -> float:
 return self.nums[self.start + i]
 def __len__(self) -> int:
 return self.end - self.start

And then this will help with the binary searching.

def kth_idx(a: SlicedList, b: SlicedList, k: int) -> tuple[int, int]: # noqa PLR0911
 """Given two sorted lists of numbers, return (list_id, idx) in O(log n) time.
 We are concerned with the kth element of the merged lists a and b.
 list_id: 0 for a, 1 for b, according to which contains the kth sorted value
 idx: index of the kth element in either list a or b
 """
 assert 0 <= k < len(a) + len(b)
 if not a:
 return 1, k
 if not b:
 return 0, k
 if k == 0:
 return int(a[0] > b[0]), k
 # binary search
 ia, ib = len(a) // 2, len(b) // 2
 if ia + ib < k:
 if a[ia] > b[ib]:
 return kth_idx(a, b.slice(ib + 1), k - ib - 1)
 return kth_idx(a.slice(ia + 1), b, k - ia - 1)
 if a[ia] > b[ib]:
 return kth_idx(a.slice(0, ia), b, k)
 return kth_idx(a, b.slice(0, ib), k)
answered Dec 6, 2024 at 18:20
\$\endgroup\$
1
  • \$\begingroup\$ And nums.sort() is even worse than nums1+nums2. \$\endgroup\$ Commented Dec 12, 2024 at 23:54
4
\$\begingroup\$

The previous answer addresses your main concern about complexity. This will address one of the now-deleted comments on the question regarding functionality.

Portability

The function that was posted must be part of a class which was not posted. The code can't be run as-is due to the presence of the self keyword. I removed self and added an example function call:

def findMedianSortedArrays(nums1, nums2):
 nums = nums1 + nums2
 nums.sort
 if len(nums) % 2 == 0:
 return (nums[(len(nums) / 2) - 1] + nums[len(nums) / 2]) / 2.0
 else:
 return nums[len(nums) // 2] 
print(findMedianSortedArrays([6,4,5], [2,3,1]))

When I run the code on Python version 2.7, I get the expected answer:

3.5

However, when I run the code on Python version 3.7, I get an error:

TypeError: list indices must be integers or slices, not float

Since Python 2.7 is deprecated, I suggest modifying the code to run on newer 3.x versions as well.

Naming

The PEP 8 style guide recommends using snake_case for function names:

find_median_sorted_arrays
answered Dec 8, 2024 at 11:31
\$\endgroup\$
0

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.