K Closest Points to Origin medium

Problem Statement

Given an array of points where points[i] = [xi, yi] represents a point on the X-Y plane and an integer k, return the k closest points to the origin (0, 0).

The distance between two points on the X-Y plane is the Euclidean distance: √((x1 - x2)² + (y1 - y2)²).

You may return the answer in any order. The answer is guaranteed to be unique (except for the order that it is in).

Example 1

Input: points = [[1,3],[-2,2]], k = 1 Output: [[-2,2]] Explanation: The distance between (1, 3) and the origin is sqrt(10). The distance between (-2, 2) and the origin is sqrt(8). Since sqrt(8) < sqrt(10), (-2, 2) is closer to the origin. We only want the closest k = 1 points from the origin, so the output is just [[-2,2]].

Example 2

Input: points = [[3,3],[5,-1],[-2,4]], k = 2 Output: [[3,3],[-2,4]] Explanation: The distance between (3, 3) and the origin is sqrt(18). The distance between (5, -1) and the origin is sqrt(26). The distance between (-2, 4) and the origin is sqrt(20). The two closest points are [3,3] and [-2,4].

Steps

  1. Calculate Distances: Compute the squared Euclidean distance of each point from the origin. We use squared distance to avoid the computationally expensive square root operation. The relative order of distances remains the same.

  2. Sort (or Use a Priority Queue): Sort the points based on their squared distances in ascending order. Alternatively, use a min-heap (priority queue) to efficiently keep track of the k closest points.

  3. Return k Closest Points: Select the first k points from the sorted array (or the top k elements from the priority queue).

Explanation

The solution uses a min-heap (priority queue) to maintain the k closest points encountered so far. This is more efficient than sorting the entire array, especially for large datasets. The heap ensures that we always have the k closest points in the heap, discarding further points if they are farther than the current furthest point in the heap.

The std::priority_queue in C++ provides a max-heap by default. We use a custom comparator to create a min-heap based on the squared distance.

Code

#include <iostream>
#include <vector>
#include <queue>

using namespace std;

struct Point {
    int x, y;
    long long distSq; // Using long long to avoid integer overflow
    Point(int x, int y) : x(x), y(y), distSq((long long)x * x + (long long)y * y) {}
};

struct comparePoints {
    bool operator()(const Point& a, const Point& b) {
        return a.distSq > b.distSq; // Min-heap based on squared distance
    }
};

vector<vector<int>> kClosest(vector<vector<int>>& points, int k) {
    priority_queue<Point, vector<Point>, comparePoints> minHeap;

    for (const auto& point : points) {
        Point p(point[0], point[1]);
        minHeap.push(p);
        if (minHeap.size() > k) {
            minHeap.pop();
        }
    }

    vector<vector<int>> result;
    while (!minHeap.empty()) {
        Point p = minHeap.top();
        minHeap.pop();
        result.push_back({p.x, p.y});
    }
    reverse(result.begin(), result.end()); // Reverse to get the order from closest to furthest
    return result;
}

int main() {
    vector<vector<int>> points = {{3,3},{5,-1},{-2,4}};
    int k = 2;
    vector<vector<int>> closestPoints = kClosest(points, k);

    for (const auto& point : closestPoints) {
        cout << "[" << point[0] << ", " << point[1] << "] ";
    }
    cout << endl;
    return 0;
}

Complexity

  • Time Complexity: O(N log k), where N is the number of points. The heap operations (push and pop) take O(log k) time each.
  • Space Complexity: O(k), the space used by the priority queue.