Kth Largest Element in an Array medium

Problem Statement

Find the kth largest element in an unsorted array. Note that it is the kth largest element in the sorted order, not the kth distinct element.

Example 1:

Input: nums = [3,2,1,5,6,4], k = 2
Output: 5

Example 2:

Input: nums = [3,2,3,1,2,4,5,5,6], k = 4
Output: 4

Steps

There are several ways to solve this problem. We'll focus on two efficient approaches:

  1. QuickSelect (Average O(n) time complexity): This is a randomized algorithm based on the QuickSort partitioning scheme. It's highly efficient on average.

  2. Min-Heap (O(n log k) time complexity): We use a min-heap data structure to keep track of the k largest elements encountered so far.

Explanation:

QuickSelect:

The QuickSelect algorithm mimics the partitioning step of QuickSort. We choose a pivot element (randomly in this implementation for better average-case performance). We partition the array around the pivot, such that elements smaller than the pivot are on its left and elements larger are on its right.

  • If the pivot's position (after partitioning) is k-1, then the pivot is the kth largest element.
  • If the pivot's position is greater than k-1, the kth largest element is in the left subarray.
  • If the pivot's position is less than k-1, the kth largest element is in the right subarray.

We recursively apply this process until we find the kth largest element.

Min-Heap:

A min-heap maintains the k largest elements seen so far. Initially, the heap is empty. We iterate through the array:

  • If the heap size is less than k, we add the current element to the heap.
  • If the heap size is equal to k and the current element is greater than the smallest element in the heap (the root), we replace the root with the current element and heapify to maintain the min-heap property.

After processing all elements, the root of the heap contains the kth largest element.

Code (Java - QuickSelect):

import java.util.Random;

class Solution {
    public int findKthLargest(int[] nums, int k) {
        Random random = new Random();
        return quickSelect(nums, 0, nums.length - 1, nums.length - k, random);
    }

    private int quickSelect(int[] nums, int left, int right, int k, Random random) {
        int pivotIndex = left + random.nextInt(right - left + 1);
        int pivot = nums[pivotIndex];
        swap(nums, pivotIndex, right);

        int l = left;
        int r = right - 1;
        while (l <= r) {
            if (nums[l] < pivot) {
                l++;
            } else if (nums[r] >= pivot) {
                r--;
            } else {
                swap(nums, l, r);
                l++;
                r--;
            }
        }
        swap(nums, l, right);

        if (l == k) {
            return nums[l];
        } else if (l < k) {
            return quickSelect(nums, l + 1, right, k, random);
        } else {
            return quickSelect(nums, left, l - 1, k, random);
        }
    }

    private void swap(int[] nums, int i, int j) {
        int temp = nums[i];
        nums[i] = nums[j];
        nums[j] = temp;
    }
}

Code (Java - Min-Heap):

import java.util.PriorityQueue;

class Solution {
    public int findKthLargest(int[] nums, int k) {
        PriorityQueue<Integer> minHeap = new PriorityQueue<>();
        for (int num : nums) {
            minHeap.offer(num);
            if (minHeap.size() > k) {
                minHeap.poll();
            }
        }
        return minHeap.peek();
    }
}

Complexity

QuickSelect:

  • Time Complexity: Average: O(n), Worst Case: O(n^2) (though highly unlikely with random pivot selection)
  • Space Complexity: O(log n) (due to recursive calls in the average case; worst case O(n))

Min-Heap:

  • Time Complexity: O(n log k)
  • Space Complexity: O(k) (to store the min-heap)

The QuickSelect approach is generally preferred for its average linear time complexity, making it more efficient for larger arrays, especially when k is not significantly smaller than n. The Min-Heap approach is simpler to implement and offers guaranteed O(n log k) performance, making it a good choice when k is relatively small compared to n.