K Closest Points to Origin medium

Problem Statement

Given an array of points where points[i] = [xi, yi] represents a point on the X-Y plane and an integer k, return the k closest points to the origin (0, 0).

The distance between two points on the X-Y plane is the Euclidean distance: √((x1 - x2)² + (y1 - y2)²).

You may return the answer in any order. The answer is guaranteed to be unique (except for the order that it is in).

Example 1:

Input: points = [[1,3],[-2,2]], k = 1 Output: [[-2,2]] Explanation: The distance between (1, 3) and the origin is sqrt(10). The distance between (-2, 2) and the origin is sqrt(8). Since sqrt(8) < sqrt(10), (-2, 2) is closer to the origin. We only want the closest k = 1 points from the origin, so the output is [[-2,2]].

Example 2:

Input: points = [[3,3],[5,-1],[-2,4]], k = 2 Output: [[3,3],[-2,4]] Explanation: The distance between (3, 3) and the origin is sqrt(18). The distance between (5, -1) and the origin is sqrt(26). The distance between (-2, 4) and the origin is sqrt(20). The distances are sqrt(18), sqrt(26), sqrt(20) respectively. The two closest points are [3,3], [-2,4].

Steps:

  1. Calculate Distances: For each point, calculate its squared Euclidean distance from the origin. We use squared distance instead of the actual distance to avoid the computationally expensive square root operation. The relative order of distances remains the same.

  2. Sort: Sort the points based on their squared distances in ascending order. Python's sorted() function with a custom key is ideal for this.

  3. Return k Closest: Return the first k points from the sorted list.

Explanation:

The solution leverages the fact that comparing squared distances is equivalent to comparing distances. This optimization avoids repeatedly calculating square roots, significantly improving performance, especially for large datasets. The sorted() function efficiently sorts the points based on the squared distances, allowing us to easily select the k closest points.

Code:

import heapq

def kClosest(points, k):
    """
    Finds the k closest points to the origin.

    Args:
      points: A list of lists, where each inner list represents a point [x, y].
      k: The number of closest points to return.

    Returns:
      A list of lists representing the k closest points.
    """

    #Efficient approach using heapq (min-heap) for large datasets
    heap = []
    for x, y in points:
        dist_sq = x**2 + y**2
        heapq.heappush(heap, (-dist_sq, x, y)) #negate distance for min-heap to act as max-heap

        if len(heap) > k:
            heapq.heappop(heap)

    result = [[x,y] for _, x, y in heap]
    return result


    #Simple approach using sort (suitable for smaller datasets)
    # points.sort(key=lambda p: p[0]**2 + p[1]**2)
    # return points[:k]

Complexity:

  • Time Complexity: O(N log k) using heapq for large datasets where N is the number of points. O(N log N) using simple sort for smaller datasets.
  • Space Complexity: O(k) using heapq. O(1) using in-place sort. The space used by the result list is not considered in space complexity analysis because it's part of the output.