Kth Largest Element in an Array medium
Problem Statement
Find the kth largest element in an unsorted array. Note that it is the kth largest element in the sorted order, not the kth distinct element.
You can assume k is always valid, 1 ≤ k ≤ array's length.
Example 1:
Input: nums = [3,2,1,5,6,4], k = 2 Output: 5
Example 2:
Input: nums = [3,2,3,1,2,4,5,5,6], k = 4 Output: 4
Steps
-
Sorting: The most straightforward approach is to sort the array in descending order and then return the element at index
k-1
. This is simple but not the most efficient for large arrays. -
Heap (Priority Queue): A min-heap data structure can efficiently track the k largest elements seen so far. We iterate through the array, adding each element to the heap. If the heap size exceeds k, we remove the smallest element (root of the min-heap). After processing the entire array, the root of the heap will be the kth largest element.
-
QuickSelect (Partition-based approach): This is a randomized algorithm inspired by the QuickSort algorithm. It partitions the array around a pivot element. If the pivot's index is k-1, we've found the kth largest element. Otherwise, we recursively apply the algorithm to either the left or right partition. This approach has an average time complexity of O(n), but its worst-case complexity is O(n^2).
Explanation
We'll implement the heap-based solution because it offers a guaranteed O(n log k) time complexity, which is better than sorting (O(n log n)) when k is significantly smaller than n. The min-heap ensures that we only need to maintain the k largest elements seen so far in memory.
Code (Python)
import heapq
def findKthLargest(nums, k):
"""
Finds the kth largest element in an array using a min-heap.
Args:
nums: The input array of numbers.
k: The desired kth largest element.
Returns:
The kth largest element.
"""
return heapq.nlargest(k, nums)[-1]
#Alternative Implementation using a min-heap explicitly:
def findKthLargest_heap(nums, k):
min_heap = []
for num in nums:
heapq.heappush(min_heap, num) #Push each element into min heap
if len(min_heap) > k:
heapq.heappop(min_heap) #Remove the smallest if heap size exceeds k
return min_heap[0]
Complexity
-
Time Complexity: O(n log k) for the heap-based solution. Building the heap takes O(k log k) time, and iterating through the remaining n-k elements takes O((n-k) log k) time. The overall time complexity simplifies to O(n log k) because k <= n. Sorting would be O(n log n). QuickSelect's average case is O(n), but its worst case is O(n^2).
-
Space Complexity: O(k) for the heap-based solution because the heap stores at most k elements. Sorting might use O(log n) extra space depending on the sorting algorithm used.