Median of Two Sorted Arrays hard

Problem Statement

There are two sorted arrays nums1 and nums2 of size m and n respectively. Find the median of the two sorted arrays. The overall run time complexity should be O(log (m+n)).

Example 1

nums1 = [1, 3]
nums2 = [2]
The median is 2.0

Example 2

nums1 = [1, 2]
nums2 = [3, 4]
The median is (2 + 3)/2 = 2.5

Steps

The most efficient approach to solve this problem involves a binary search strategy. We can reduce the problem to finding the kth smallest element in the merged sorted array, where k = (m + n + 1) / 2 for the lower median and k = (m + n + 2) / 2 for the upper median. The median is the average of these two values if (m+n) is even.

Here's a breakdown of the binary search approach:

  1. Partitioning: We partition the shorter array (let's assume it's nums1) into two parts: nums1[0...i-1] and nums1[i...m-1]. The goal is to find the optimal i such that the kth smallest element is found efficiently.

  2. Binary Search: We perform a binary search on i (0 to m). For a given i, we can determine j such that the number of elements in nums1[0...i-1] and nums2[0...j-1] combined is equal to k-1. (This ensures we have k elements in total when we consider the next element)

  3. Condition Check: We compare nums1[i-1] and nums2[j-1] (if i or j is 0 we consider -infinity).

    • If nums1[i-1] <= nums2[j-1], then the kth smallest element must be in nums1[i...m-1] or nums2[0...j-1]. We adjust our search space accordingly.
    • If nums1[i-1] > nums2[j-1], then the kth smallest element must be in nums1[0...i-1] or nums2[j...n-1]. We adjust our search space accordingly.
  4. Termination: The binary search terminates when we find the correct i and j such that nums1[i-1] <= nums2[j] and nums2[j-1] <= nums1[i] (handling edge cases of i or j being 0). In this case, max(nums1[i-1], nums2[j-1]) will be the kth smallest element.

Explanation

The binary search approach cleverly avoids explicitly merging the two arrays. By focusing on finding the kth smallest element using partitioning, it achieves the desired logarithmic time complexity. The key is efficiently determining how to adjust the search space based on the comparison between elements from the partitions of nums1 and nums2.

Code

#include <iostream>
#include <vector>
#include <algorithm>

using namespace std;

double findMedianSortedArrays(vector<int>& nums1, vector<int>& nums2) {
    int m = nums1.size();
    int n = nums2.size();

    if (m > n) {
        swap(nums1, nums2);
        swap(m, n);
    }

    int k = (m + n + 1) / 2;
    int left = 0, right = m;

    while (left <= right) {
        int i = left + (right - left) / 2;
        int j = k - i;

        int nums1_left = (i == 0) ? INT_MIN : nums1[i - 1];
        int nums2_left = (j == 0) ? INT_MIN : nums2[j - 1];
        int nums1_right = (i == m) ? INT_MAX : nums1[i];
        int nums2_right = (j == n) ? INT_MAX : nums2[j];

        if (nums1_left <= nums2_right && nums2_left <= nums1_right) {
            if ((m + n) % 2 == 0) {
                return (double)(max(nums1_left, nums2_left) + min(nums1_right, nums2_right)) / 2;
            } else {
                return (double)max(nums1_left, nums2_left);
            }
        } else if (nums1_left > nums2_right) {
            right = i - 1;
        } else {
            left = i + 1;
        }
    }
    return 0; //Should never reach here.
}

int main() {
    vector<int> nums1 = {1, 3};
    vector<int> nums2 = {2};
    cout << findMedianSortedArrays(nums1, nums2) << endl; // Output: 2

    vector<int> nums3 = {1, 2};
    vector<int> nums4 = {3, 4};
    cout << findMedianSortedArrays(nums3, nums4) << endl; // Output: 2.5

    return 0;
}

Complexity

  • Time Complexity: O(log(min(m, n))), due to the binary search on the shorter array.
  • Space Complexity: O(1), as we are using only a constant amount of extra space.