Find Median from Data Stream hard

Problem Statement

Design a data structure that supports the following two operations:

  • void addNum(int num): Add a integer number from the data stream to the data structure.
  • double findMedian(): Return the median of all elements so far.

For example:

addNum(1)
addNum(2)
findMedian() -> 1.5
addNum(3)
findMedian() -> 2

Example 1:

Input
["MedianFinder", "addNum", "addNum", "findMedian", "addNum", "findMedian"]
[[], [1], [2], [], [3], []]
Output
[null, null, null, 1.5, null, 2.0]

Explanation
MedianFinder medianFinder = new MedianFinder();
medianFinder.addNum(1);    // arr = [1]
medianFinder.addNum(2);    // arr = [1, 2]
medianFinder.findMedian(); // return 1.5 (1 + 2) / 2 = 1.5
medianFinder.addNum(3);    // arr = [1, 2, 3]
medianFinder.findMedian(); // return 2.0

Example 2:

Input
["MedianFinder","addNum","addNum","findMedian","addNum","findMedian","addNum","findMedian","addNum","findMedian","addNum","findMedian"]
[[],[1],[2],[],[3],[],[4],[],[5],[],[6],[]]
Output
[null,null,null,1.5,null,2.0,null,2.5,null,3.0,null,3.5]

Steps and Explanation

This problem can be efficiently solved using two heaps: a min-heap and a max-heap.

  1. Data Structures: We'll use a max-heap (maxHeap) to store the smaller half of the numbers and a min-heap (minHeap) to store the larger half. The max-heap will always have either the same number of elements as the min-heap, or one more (in case of an odd number of elements).

  2. addNum(int num):

    • If maxHeap is empty or num is less than or equal to the root of maxHeap, add num to maxHeap.
    • Otherwise, add num to minHeap.
    • After adding the number, balance the heaps. If the size difference between the heaps is greater than 1, move the root of the larger heap to the smaller heap.
  3. findMedian():

    • If the sizes of both heaps are equal, the median is the average of their roots.
    • Otherwise, the median is the root of the larger heap.

Code (Java)

import java.util.PriorityQueue;

class MedianFinder {
    PriorityQueue<Integer> maxHeap; // Smaller half
    PriorityQueue<Integer> minHeap; // Larger half

    public MedianFinder() {
        maxHeap = new PriorityQueue<>((a, b) -> b - a); // Max heap (reverse order)
        minHeap = new PriorityQueue<>(); // Min heap
    }

    public void addNum(int num) {
        maxHeap.offer(num);
        minHeap.offer(maxHeap.poll());

        if (maxHeap.size() < minHeap.size()) {
            maxHeap.offer(minHeap.poll());
        }
    }

    public double findMedian() {
        if (maxHeap.size() == minHeap.size()) {
            return (double) (maxHeap.peek() + minHeap.peek()) / 2;
        } else {
            return (double) maxHeap.peek();
        }
    }
}

Complexity Analysis

  • Time Complexity:

    • addNum(): O(log n), due to heap operations.
    • findMedian(): O(1), as it only involves accessing the top elements of the heaps.
  • Space Complexity: O(n), where n is the number of elements added, to store the heaps.