Find Median from Data Stream hard
Problem Statement
Design a data structure that supports the following two operations:
void addNum(int num)
: Add a integer number from the data stream to the data structure.double findMedian()
: Return the median of all elements so far in the data structure.
Example 1
MedianFinder medianFinder = new MedianFinder();
medianFinder.addNum(1); // arr = [1]
medianFinder.addNum(2); // arr = [1, 2]
medianFinder.findMedian(); // return 1.5 (because the median is (1 + 2) / 2 = 1.5)
medianFinder.addNum(3); // arr = [1, 2, 3]
medianFinder.findMedian(); // return 2.0
Example 2
MedianFinder medianFinder = new MedianFinder();
medianFinder.addNum(5); // arr = [5]
medianFinder.addNum(2); // arr = [5,2]
medianFinder.findMedian(); // return 3.5 (because the median is (2 + 5) / 2 = 3.5)
Steps
-
Data Structure: We will use two heaps: a max-heap
small
to store the smaller half of the numbers and a min-heaplarge
to store the larger half.small
will always have at most one more element thanlarge
if the total number of elements is odd. -
addNum(int num):
- If
small
is empty ornum
is less than or equal to the top ofsmall
, addnum
tosmall
. - Otherwise, add
num
tolarge
. - Balance the heaps: If the size difference between
small
andlarge
is greater than 1, move the top element from the larger heap to the smaller heap.
- If
-
findMedian():
- If the total number of elements is odd, the median is the top of
small
. - If the total number of elements is even, the median is the average of the tops of
small
andlarge
.
- If the total number of elements is odd, the median is the top of
Explanation
Using two heaps allows us to efficiently maintain the sorted order of the elements without explicitly sorting the entire array each time. The max-heap (small
) tracks the smaller half, and the min-heap (large
) tracks the larger half. By maintaining a balanced state between these two heaps, we can quickly find the median.
Code
#include <queue>
class MedianFinder {
public:
priority_queue<int> small; // max heap
priority_queue<int, vector<int>, greater<int>> large; // min heap
MedianFinder() {}
void addNum(int num) {
small.push(num);
large.push(small.top());
small.pop();
if (small.size() < large.size()) {
small.push(large.top());
large.pop();
}
}
double findMedian() {
if (small.size() > large.size()) {
return (double)small.top();
} else {
return (double)(small.top() + large.top()) / 2.0;
}
}
};
Complexity
-
Time Complexity:
addNum
: O(log n), where n is the number of elements added. Heap operations take logarithmic time.findMedian
: O(1). Accessing the top of a heap is constant time.
-
Space Complexity: O(n), as we store all the numbers in the heaps.