283. K Closest Points to Origin
📝 LeetCode Problem: 973. K Closest Points to Origin
📊 Difficulty: Medium
🏷️ Topics: Array, Math, Divide and Conquer, Geometry, Sorting, Heap (Priority Queue), Quickselect
Problem Statement
Given an array of points where points[i] = [xi, yi] represents a point on the X-Y plane and an integer k, return the k closest points to the origin (0, 0).
The distance between two points on the X-Y plane is the Euclidean distance (i.e., √((x1 - x2)² + (y1 - y2)²)).
You may return the answer in any order. The answer is guaranteed to be unique (except for the order that it is in).
Example 1:
Input: points = [[1,3],[-2,2]], k = 1
Output: [[-2,2]]
Explanation:
The distance between (1, 3) and the origin is sqrt(10).
The distance between (-2, 2) and the origin is sqrt(8).
Since sqrt(8) < sqrt(10), (-2, 2) is closer to the origin.
We only want the closest k = 1 points from the origin, so the answer is just [[-2,2]].
Example 2:
Input: points = [[3,3],[5,-1],[-2,4]], k = 2
Output: [[3,3],[-2,4]]
Explanation: The answer [[-2,4],[3,3]] would also be accepted.
Constraints:
- 1 <= k <= points.length <= 10^4
- -10^4 <= xi, yi <= 10^4
🔍 Understanding the Problem
CRITICAL: Distance Formula Clarification
Question: Why is the formula x² + y² and not (x₁-x₂)² + (y₁-y₂)²?
GENERAL distance formula between TWO arbitrary points:
Point 1: (x₁, y₁)
Point 2: (x₂, y₂)
Distance = √((x₁-x₂)² + (y₁-y₂)²)
But in THIS problem, one point is ALWAYS the origin (0, 0)!
Distance from point (x, y) to ORIGIN (0, 0):
Applying the general formula:
Distance = √((x-0)² + (y-0)²)
         = √(x² + y²)
Why the simplification?
(x-0) = x, so (x-0)² = x²
(y-0) = y, so (y-0)² = y²
The origin coordinates (0, 0) subtract to nothing!
Examples demonstrating the simplification:
Point (3, 4) to origin (0, 0):
Full formula: √((3-0)² + (4-0)²) = √(9 + 16) = √25 = 5
Simplified: √(3² + 4²) = √25 = 5 ✓
Point (-2, 2) to origin (0, 0):
Full formula: √((-2-0)² + (2-0)²) = √(4 + 4) = √8 ≈ 2.83
Simplified: √((-2)² + 2²) = √8 ≈ 2.83 ✓
Point (5, 0) to origin (0, 0):
Simplified: √(5² + 0²) = √25 = 5 ✓
In code:
int dist = x * x + y * y;
This IS the formula (x-0)² + (y-0)²
The -0 disappears, leaving just x² + y²!
Why we can skip sqrt:
For COMPARISON only, we don't need the actual distance value!
Mathematical property (sqrt is monotonically increasing):
√a < √b ⇔ a < b
So comparing distances:
√8 < √10 is equivalent to 8 < 10
We can compare SQUARED distances directly:
x₁² + y₁² vs x₂² + y₂²
Example:
Point A (1, 3): dist² = 1² + 3² = 1 + 9 = 10
Point B (-2, 2): dist² = (-2)² + 2² = 4 + 4 = 8
Compare squared: 8 < 10
Therefore: B is closer! ✓
No need to compute √8 ≈ 2.83 and √10 ≈ 3.16!
Benefits:
✅ Faster computation (no expensive sqrt operation)
✅ Exact comparison (no floating-point errors)
✅ Simpler code
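The squared-distance comparison can be checked with a tiny standalone demo (the class name is mine, not part of the solutions below):

```java
// Tiny demo: comparing squared distances gives the same ordering as
// comparing true Euclidean distances, with no sqrt and no rounding.
public class SquaredDistanceDemo {
    static int distSq(int x, int y) {
        return x * x + y * y; // (x-0)^2 + (y-0)^2 - sqrt skipped
    }

    public static void main(String[] args) {
        int a = distSq(1, 3);  // point A (1, 3)  -> 10
        int b = distSq(-2, 2); // point B (-2, 2) -> 8
        System.out.println(b < a);                       // same verdict as...
        System.out.println(Math.sqrt(b) < Math.sqrt(a)); // ...the sqrt version
    }
}
```

Both comparisons print `true`: the ordering is identical with or without the square root.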
Key Observations:
1. Distance to origin SIMPLIFIES
General: √((x₁-x₂)² + (y₁-y₂)²)
To origin: √(x² + y²) [because origin is (0,0)]
2. Can skip sqrt for comparison
Compare x² + y² directly
Saves computation time!
3. k closest = k smallest distances
Similar to the "Kth Largest Element" pattern!
4. Order doesn't matter in result
[[-2,2], [1,0]] and [[1,0], [-2,2]] both valid
🧠 The Natural Thinking Process
When you first see this problem:
Question: Find k closest points to origin
First Thought: "Calculate distances, sort, take first k"
- Compute distance for each point
- Sort by distance
- Return first k points
Is this optimal? NO (sorts all!), but it WORKS!
Let's code this first to understand.
The Evolution:
BRUTE FORCE THINKING:
"Calculate distances → Sort all → Take k"
Time: O(n log n) for sorting
Problem: Sorting ALL when only need k!
⬇
REALIZATION 1:
"I don't need ALL sorted, just k smallest!"
"This is exactly like Problem 278 (Top K Frequent)!"
"And Problem 281 (Kth Largest in Stream)!"
"Use heap of size k!"
⬇
OPTIMIZATION 1: Heap of Size K
"Keep heap of k closest"
"But... min-heap or max-heap?"
Time: O(n log k) - better when k << n!
⬇
CRITICAL QUESTION:
"Closest = smallest distance"
"So min-heap for smallest, right?"
"NO! Need max-heap! Let me understand why..."
⬇
OPTIMIZATION 2: Max-Heap of Size K
"Max-heap tracks k smallest distances!"
"Top = largest of k smallest = boundary!"
"Easy to evict farthest when find closer!"
Time: O(n log k) ✨
⬇
REALIZATION 2:
"Can I do even better?"
"Quickselect for O(n) average!"
⬇
OPTIMIZATION 3: Quickselect
"Partition-based selection"
Time: O(n) average - OPTIMAL! ✨
🎯 Approach 1: Brute Force - Sort All Points ⭐
The First Natural Solution
The Thought Process:
Step 1: Calculate distance for each point
distanceΒ² = xΒ² + yΒ² (no need for sqrt)
Step 2: Sort points by distance (ascending)
Step 3: Return first k points
Simple and straightforward!
Visual Tracking - Complete Example
Input: points = [[1,3], [-2,2], [2,2]], k = 2
───────────────────────────────────────────────────────────────
STEP 1: Calculate Distances (squared, no sqrt needed)
───────────────────────────────────────────────────────────────
Point [1, 3]:
distance² = 1² + 3² = 1 + 9 = 10
(This is (1-0)² + (3-0)² since origin is (0,0))
Point [-2, 2]:
distance² = (-2)² + 2² = 4 + 4 = 8
Point [2, 2]:
distance² = 2² + 2² = 4 + 4 = 8
Points with distances:
[[1,3], 10]
[[-2,2], 8]
[[2,2], 8]
───────────────────────────────────────────────────────────────
STEP 2: Sort by Distance (Ascending)
───────────────────────────────────────────────────────────────
Before sort:
[[1,3]=10, [-2,2]=8, [2,2]=8]
After sort (by distance ascending):
[[-2,2]=8, [2,2]=8, [1,3]=10]
 └────closest─────┘ └farther┘
───────────────────────────────────────────────────────────────
STEP 3: Take First K
───────────────────────────────────────────────────────────────
k = 2
Take first 2 points:
[[-2,2], [2,2]]
Result: [[-2,2], [2,2]] ✓
Verification:
Both have distance² = 8 (closest)
Point [1,3] has distance² = 10 (farther)
Correct! ✓
───────────────────────────────────────────────────────────────
Implementation
import java.util.*;
/**
 * Brute Force: Sort all points
 * Time: O(n log n)
 * Space: O(log n) to O(n) for the sort itself
 * (Java's Arrays.sort on object arrays uses TimSort)
 */
class Solution {
public int[][] kClosest(int[][] points, int k) {
// Sort by distance to origin
Arrays.sort(points, (a, b) -> {
int distA = a[0] * a[0] + a[1] * a[1];
int distB = b[0] * b[0] + b[1] * b[1];
return Integer.compare(distA, distB); // Safe from overflow
});
// Return first k points
return Arrays.copyOfRange(points, 0, k);
}
}
⏰ Time: O(n log n) - Sorting dominates
💾 Space: O(log n) to O(n) for the sort itself
Why This is Inefficient:
Problem: Sorts ALL n points just to get k!
Example: n = 10,000, k = 10
Must sort all 10,000 points
Only need the 10 closest
Fully ordered 9,990 points unnecessarily! Wasteful!
Can we avoid the full sort?
💡 The First AHA Moment
OBSERVATION:
"I'm sorting ALL points just to get k closest!"
"This is EXACTLY like Problem 278 (Top K Frequent)!"
"And Problem 281 (Kth Largest in Stream)!"
Pattern recognition:
Top K Frequent → Use min-heap of size k
Kth Largest → Use min-heap of size k
K Closest → Use... heap of size k!
But wait - which heap?
Closest = smallest distance
So k closest = k smallest distances
Should I use min-heap for smallest?
Let me think carefully about this...
🎯 Approach 2: Max-Heap of Size K (Optimal for Most Cases) ⭐⭐⭐
The Better Solution
The Evolution of Thought:
Brute Force: Sort all
⬇
Observation: "Only need k closest, not all sorted!"
⬇
Pattern: "Like Top K problems!"
⬇
Critical Question: "Min-heap or max-heap?"
⬇
Answer: "MAX-HEAP for k smallest distances!"
WHY Max-Heap for K Closest - Complete Textbook Reasoning
The Core Question:
For k CLOSEST points (k smallest distances),
should we use min-heap or max-heap?
Intuition says: "Smallest distances → min-heap!"
But that's WRONG!
Let me prove why MAX-HEAP is correct...
Complete Answer with Proof:
PRINCIPLE 1: Understanding the Goal
───────────────────────────────────────────────────────────────
Goal: Track k points with SMALLEST distances (k closest)
Strategy: Keep heap of exactly k elements
- These k elements = k points with smallest distances
- Need to decide: which element is "on top"?
───────────────────────────────────────────────────────────────
PRINCIPLE 2: Comparison with K Largest
───────────────────────────────────────────────────────────────
Let's compare with a known pattern:
Problem 281: Kth LARGEST element
Goal: Track k largest values
Heap contains: k largest values
Top of heap: Smallest of k largest = kth largest
Heap type: MIN-heap ✓
Why min-heap?
- Top = smallest of k largest
- When new larger value arrives, evict this smallest
- Easy to find what to evict!
Problem 283: K CLOSEST points
Goal: Track k smallest distances
Heap contains: k smallest distances
Top of heap: Largest of k smallest = kth smallest
Heap type: MAX-heap ✓
Why max-heap?
- Top = largest of k smallest
- When new smaller value arrives, evict this largest
- Easy to find what to evict!
Pattern: Use the "OPPOSITE" heap!
- k largest → MIN-heap
- k smallest → MAX-heap
───────────────────────────────────────────────────────────────
PRINCIPLE 3: Max-Heap Strategy Explained
───────────────────────────────────────────────────────────────
With MAX-HEAP of size k:
Invariant maintained:
- Heap contains exactly k points with SMALLEST distances
- Top element = point with LARGEST distance among these k
- Top = "boundary" = kth smallest distance
When new point arrives with distance d:
CASE 1: Heap size < k
Decision: Add unconditionally
Why? Still building to k elements
CASE 2: Heap size = k, and d < top (max)
Observation: New distance smaller than current boundary
Decision: New point is closer than current kth closest
Action: Remove top (farthest of k), add new point
Result: Heap still has k closest ✓
CASE 3: Heap size = k, and d ≥ top (max)
Observation: New distance ≥ boundary
Decision: New point NOT in k closest
Action: Skip (don't add)
Result: Heap unchanged ✓
───────────────────────────────────────────────────────────────
CONCRETE EXAMPLE: Max-Heap in Action
───────────────────────────────────────────────────────────────
k = 2
Current heap distances: [8, 5]
Max-heap structure: top = 8 (larger)
Visualization:
Heap represents: 2 closest points so far
Point with dist 5: closest
Point with dist 8: 2nd closest (farthest of 2 closest)
Top = 8 = "boundary"
Scenario 1: Add point with distance 6
Compare: 6 vs 8 (top)
6 < 8? YES
Analysis:
New point closer than current boundary (8)
New point should be in top 2!
Current point with dist 8 should be evicted
Action:
Remove top (dist 8)
Add new (dist 6)
New heap: [6, 5]
New top: 6
New boundary: 6 ✓
Scenario 2: Add point with distance 10
Compare: 10 vs 6 (top)
10 < 6? NO
Analysis:
New point farther than boundary
Not in top 2 closest
Action: Skip
Heap: [6, 5] (unchanged) ✓
───────────────────────────────────────────────────────────────
WHY NOT Min-Heap? (Proof by Counter-Example)
───────────────────────────────────────────────────────────────
Attempt: Use MIN-HEAP of size k for k smallest
Min-heap structure:
Top = SMALLEST element in heap
Problem Scenario:
Current heap: [5, 8]
Min-heap: top = 5
Add new point with distance 6:
Compare: 6 vs 5 (top)
6 > 5... now what?
Analysis:
Should we add 6? YES (it's 3rd smallest so far)
Should we remove 5? NO (5 is smaller, keep it!)
Should we remove 8? YES (8 is the largest we have)
Problem:
8 is NOT at the top!
8 is buried somewhere in the heap!
Min-heap gives us the minimum, but we need the MAXIMUM!
Can't easily access what to evict! ❌
Fundamental flaw:
Min-heap optimized for: finding minimum
What we need: finding maximum (to evict)
Min-heap is the WRONG tool for this job! ❌
───────────────────────────────────────────────────────────────
INTUITION: Boundary Visualization
───────────────────────────────────────────────────────────────
Think of the k closest points forming a circle around the origin:
      Origin
        •
       /|\
      / | \
     /  •  \   ← k closest points
    •───•───•
        ↑
  boundary point
  (kth closest = farthest of k closest)
The boundary point:
- Farthest among k closest
- kth smallest distance
- Decision criterion!
Max-heap keeps the boundary point on top!
Any point:
Inside boundary (dist < top) → add to k
Outside boundary (dist ≥ top) → skip
Max-heap top = boundary = decision point! ✓
───────────────────────────────────────────────────────────────
COMPARISON TABLE
───────────────────────────────────────────────────────────────
┌───────────────────┬──────────────────┬──────────────────┐
│ Aspect            │ Min-Heap         │ Max-Heap         │
├───────────────────┼──────────────────┼──────────────────┤
│ Top Element       │ Smallest in heap │ Largest in heap  │
│ For k closest     │ Closest point    │ Farthest of k    │
│ Add decision      │ Compare with min │ Compare with max │
│ When to add?      │ Unclear!         │ If dist < top    │
│ What to evict?    │ Need max (hard!) │ Top (easy!)      │
│ Correctness       │ ❌ WRONG         │ ✅ CORRECT       │
└───────────────────┴──────────────────┴──────────────────┘
───────────────────────────────────────────────────────────────
SUMMARY: Complete Logic
───────────────────────────────────────────────────────────────
Goal: Find k smallest distances (k closest points)
Solution: Max-heap of size k
Why max-heap?
1. Contains k smallest distances
2. Top = largest of these k
3. Top = kth smallest = boundary
4. Easy check: if new < top → closer than boundary
5. Easy evict: remove top (farthest of k)
Algorithm:
for each point:
    if heap.size < k:
        add point
    else if distance < heap.top:
        remove heap.top
        add point
    else:
        skip
This is the SAME pattern as:
- Problem 281: Min-heap for k largest
- Problem 278: Min-heap for k frequent
Universal pattern: "Opposite heap for boundary"! ✨
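This decision loop can be coded directly - a sketch (the `distSq` helper and class name are mine, not from the solutions in this note). Unlike the offer-then-trim implementation later on, this variant never pushes a far point into the heap, so points outside the boundary cost O(1):

```java
import java.util.PriorityQueue;

// Sketch of the compare-before-add loop: a max-heap of size k where
// far points are rejected via peek() instead of an offer()+poll().
class KClosestSketch {
    static int distSq(int[] p) {
        return p[0] * p[0] + p[1] * p[1]; // squared distance to origin
    }

    static int[][] kClosest(int[][] points, int k) {
        // Max-heap on squared distance: farthest of the kept k on top.
        PriorityQueue<int[]> heap = new PriorityQueue<>(
                (a, b) -> Integer.compare(distSq(b), distSq(a)));
        for (int[] p : points) {
            if (heap.size() < k) {
                heap.offer(p);                           // still building to k
            } else if (distSq(p) < distSq(heap.peek())) {
                heap.poll();                             // evict the boundary
                heap.offer(p);                           // admit the closer point
            }                                            // else: skip, O(1)
        }
        int[][] result = new int[k][];
        for (int i = 0; i < k; i++) result[i] = heap.poll();
        return result;
    }
}
```

Same O(n log k) bound, but the heap is only touched when a point actually belongs in the current k.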
Visual Tracking - Complete Example
Input: points = [[1,3], [-2,2], [2,2], [5,1]], k = 2
Distances:
[1,3]: 10
[-2,2]: 8
[2,2]: 8
[5,1]: 26
───────────────────────────────────────────────────────────────
BUILD MAX-HEAP (SIZE k=2)
───────────────────────────────────────────────────────────────
─────────────────────────────────────────────────────────────
Iteration 1: Process [1,3], dist=10
─────────────────────────────────────────────────────────────
Current heap: []
Heap size: 0 < k=2
Decision: Add unconditionally (building to k)
Action: Add [1,3]
Heap state:
[[1,3]]
dist=10
Size: 1
─────────────────────────────────────────────────────────────
Iteration 2: Process [-2,2], dist=8
─────────────────────────────────────────────────────────────
Current heap: [[1,3]=10]
Size: 1 < k=2
Decision: Add unconditionally
Action: Add [-2,2]
Heap state (max-heap):
[[1,3]=10, [-2,2]=8]
  ↑ top (max)
Size: 2 = k (full now!)
Explanation:
Max-heap property: parent ≥ children
10 > 8, so [1,3] on top
─────────────────────────────────────────────────────────────
Iteration 3: Process [2,2], dist=8
─────────────────────────────────────────────────────────────
Current heap: [[1,3]=10, [-2,2]=8]
Top: 10
Size: 2 = k
Compare new distance with top:
8 vs 10
8 < 10? YES
Analysis:
New point [2,2] has distance 8
Top has distance 10
8 < 10 → new point CLOSER than farthest!
New point should be in k closest
Old farthest [1,3] should be evicted
Decision: Replace top
Action:
Step 1: poll() removes top [1,3]
Heap: [[-2,2]=8]
Step 2: offer([2,2])
Heap: [[-2,2]=8, [2,2]=8]
Size: 2
Heap state:
Both distances equal (8)
Either can be on top
Max-heap: [[2,2]=8, [-2,2]=8] or [[-2,2]=8, [2,2]=8]
Top = 8
─────────────────────────────────────────────────────────────
Iteration 4: Process [5,1], dist=26
─────────────────────────────────────────────────────────────
Current heap: top = 8
Size: 2 = k
Compare new distance with top:
26 vs 8
26 < 8? NO
Analysis:
New point [5,1] has distance 26
Top has distance 8
26 > 8 → new point FARTHER than boundary!
New point NOT in k closest
Decision: Skip (don't add)
Action: None
Heap: [[-2,2]=8, [2,2]=8] (unchanged)
Size: 2
───────────────────────────────────────────────────────────────
RESULT
───────────────────────────────────────────────────────────────
Final heap: [[-2,2], [2,2]]
Verification:
Both have distance 8
These are the 2 closest to origin! ✓
Operations performed:
- 4 points processed
- 3 additions to heap
- 1 removal from heap
- Time: O(4 log 2) = O(4)
Compare with sorting:
- Sort 4 points: O(4 log 4) ≈ O(8)
- Heap is better even for small n!
Key observations:
✅ Heap never exceeded size k
✅ Top always = farthest of k closest
✅ Easy decision: compare with top
✅ Efficient eviction: just poll()
Implementation
import java.util.*;
/**
* Max-Heap of size k
* Time: O(n log k)
* Space: O(k)
*/
class Solution {
public int[][] kClosest(int[][] points, int k) {
// Max-heap by distance (farthest on top)
PriorityQueue<int[]> maxHeap = new PriorityQueue<>((a, b) -> {
int distA = a[0] * a[0] + a[1] * a[1];
int distB = b[0] * b[0] + b[1] * b[1];
return Integer.compare(distB, distA); // Max-heap, safe from overflow
});
for (int[] point : points) {
maxHeap.offer(point);
// Keep heap size ≤ k
if (maxHeap.size() > k) {
maxHeap.poll(); // Remove farthest
}
}
// Extract k closest points
int[][] result = new int[k][2];
for (int i = 0; i < k; i++) {
result[i] = maxHeap.poll();
}
return result;
}
public static void main(String[] args) {
Solution sol = new Solution();
int[][] result = sol.kClosest(new int[][]{{1,3},{-2,2}}, 1);
System.out.println(Arrays.deepToString(result)); // [[-2,2]]
}
}
⏰ Time: O(n log k)
💾 Space: O(k)
Why This is Better:
✅ O(n log k) vs O(n log n)
✅ Only k points in heap vs n sorted
✅ Much better when k << n
Example: n = 10,000, k = 10
Sort: 10,000 × log(10,000) ≈ 130,000 ops
Heap: 10,000 × log(10) ≈ 33,000 ops
4x faster! ✨
Space: 10 vs 10,000 points
1000x less memory!
💡 The Second AHA Moment
OBSERVATION:
Max-heap is O(n log k), better than O(n log n)!
But can we do even better?
From Problem 275 (Kth Largest Element):
"Quickselect achieves O(n) average time!"
Can I use quickselect here too?
YES! Quickselect works for kth smallest element!
When is it better?
Max-heap: O(n log k)
Quickselect: O(n) average
When k is large (k ≈ n/2):
log k becomes significant
O(n log k) approaches O(n log n)
Quickselect's O(n) is better!
Let's understand how quickselect works for this problem...
🎯 Approach 3: Quickselect (Optimal for Large K) ⭐⭐⭐
The Optimal Solution
The Evolution of Thought:
Sort all: O(n log n)
⬇
Max-heap: O(n log k)
⬇
Observation: "Quickselect for O(n) average!"
⬇
Question: "How does it work?"
⬇
Optimal: "Partition-based selection!"
Understanding Quickselect - Complete Explanation
How does quickselect work here?
QUICKSELECT CONCEPT:
Similar to quicksort, but ONLY recurse on one side!
Goal: Partition array so that:
- First k positions contain k closest points
- Remaining positions contain farther points
- Don't need full sort! Just need correct k in first k positions
Strategy (like quicksort partitioning):
1. Pick a pivot point
2. Partition around the pivot distance:
- Points closer than pivot → left
- Points farther than pivot → right
3. Check the pivot's final position:
- If position = k → DONE! First k are the k closest
- If position < k → need more, recurse RIGHT
- If position > k → have too many, recurse LEFT
Average case: O(n)
T(n) = n + T(n/2)
     = n + n/2 + n/4 + ...
     ≈ 2n = O(n) ✓
Much better than O(n log k) when k is large! ✨
WHY Quickselect Works - Detailed Reasoning
The Core Question:
How does partitioning help find k closest?
Why don't we need full sort?
Answer Through Reasoning:
PRINCIPLE: Partial Ordering is Sufficient
───────────────────────────────────────────────────────────────
For k closest points, we need:
- k points with smallest distances in the first k positions
- Order within the first k doesn't matter!
- Order of the remaining points doesn't matter!
Full sort gives:
[1st closest, 2nd, 3rd, ..., kth, (k+1)th, ..., nth]
Time: O(n log n)
Quickselect gives:
[k closest in any order] [remaining in any order]
Time: O(n) average
We only need the first part! ✓
───────────────────────────────────────────────────────────────
HOW PARTITIONING WORKS
───────────────────────────────────────────────────────────────
Partition operation:
1. Choose a pivot point with distance d_pivot
2. Rearrange the array:
[points with dist < d_pivot] [pivot] [points with dist ≥ d_pivot]
3. Pivot ends at position p
After partition:
- All points at positions [0, p-1] are closer than the pivot
- The point at position p is the pivot
- All points at positions [p+1, n-1] are at least as far as the pivot
Check position p:
If p = k:
First k positions have the k closest! ✓
DONE!
If p < k:
Positions 0..p are settled (p+1 closest so far)
Still need more
They must be in the RIGHT partition
Recurse on [p+1, right]
If p > k:
We have too many settled on the left
The k closest are among the first p
Recurse on [left, p-1]
───────────────────────────────────────────────────────────────
WHY O(n) Average Time?
───────────────────────────────────────────────────────────────
Best case: Pivot always splits the array in half
Recursion:
T(n) = partition time + recurse on half
     = O(n) + T(n/2)
     = n + n/2 + n/4 + n/8 + ...
     = n × (1 + 1/2 + 1/4 + ...)
     = n × 2
     = O(n) ✓
The geometric series sums to a constant!
With a random pivot, the average case is O(n)
Worst case: O(n²) (e.g., sorted distances with a bad pivot every time)
But with a random pivot, that is very unlikely
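One standard way to dodge that worst case - an addition here, not part of the implementation below - is to pick the pivot index at random and swap it to the end before the usual partition:

```java
import java.util.concurrent.ThreadLocalRandom;

// Sketch: randomized pivot selection for the Lomuto-style partition
// used in this note. Swapping a random element into the `right` slot
// makes consistently bad pivots (and the O(n^2) case) very unlikely.
class RandomPivot {
    static void swap(int[][] a, int i, int j) {
        int[] t = a[i];
        a[i] = a[j];
        a[j] = t;
    }

    // Call this at the top of partition(points, left, right).
    static void randomizePivot(int[][] points, int left, int right) {
        int r = ThreadLocalRandom.current().nextInt(left, right + 1);
        swap(points, r, right); // partition then proceeds with points[right]
    }
}
```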
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
WHEN IS QUICKSELECT BETTER THAN HEAP?
───────────────────────────────────────────────────────────────
Comparison:
Max-heap: O(n log k)
Quickselect: O(n) average
When k is small (k << n):
log k ≈ log(10) ≈ 3
O(n log k) ≈ O(3n) ≈ O(n)
Both similar, heap is simpler
When k is large (k ≈ n/2):
log k ≈ log(n/2) = log n - 1
O(n log k) ≈ O(n log n)
Quickselect's O(n) is much better! ✓
Rule of thumb:
k < 100: Use max-heap (simpler)
k > n/2: Use quickselect (faster)
Space:
Max-heap: O(k)
Quickselect: O(1)
Quickselect better for space! ✓
Visual Tracking - Complete Example
Input: points = [[1,3], [-2,2], [2,2], [5,1]], k = 2
Distances:
[1,3]: 10
[-2,2]: 8
[2,2]: 8
[5,1]: 26
Goal: Rearrange so first k=2 positions have 2 closest points
───────────────────────────────────────────────────────────────
INITIAL STATE
───────────────────────────────────────────────────────────────
Array: [[1,3], [-2,2], [2,2], [5,1]]
Dist:    10      8       8      26
Index:    0      1       2       3
Target: First 2 positions should have the 2 closest
───────────────────────────────────────────────────────────────
PARTITION 1: Range [0, 3], pivot = last element
───────────────────────────────────────────────────────────────
Pivot: [5,1] at index 3, distance = 26
Partition process:
i = 0 (tracks position for smaller elements)
j = 0: [1,3], dist = 10
10 < 26? YES → swap with position i=0
Array: [[1,3], [-2,2], [2,2], [5,1]]
i becomes 1
j = 1: [-2,2], dist = 8
8 < 26? YES → swap with position i=1
Array: [[1,3], [-2,2], [2,2], [5,1]]
i becomes 2
j = 2: [2,2], dist = 8
8 < 26? YES → swap with position i=2
Array: [[1,3], [-2,2], [2,2], [5,1]]
i becomes 3
Place pivot at position i=3:
Swap [5,1] with [5,1] (no change)
Array after partition:
[[1,3], [-2,2], [2,2], [5,1]]
 └──closer than 26───┘ └26─┘
  positions 0,1,2      pos 3
Pivot position: p = 3
Check: p vs k
3 > 2 → Too many! The k closest are in the LEFT part
Recurse on [0, 2]
───────────────────────────────────────────────────────────────
PARTITION 2: Range [0, 2], pivot = last element
───────────────────────────────────────────────────────────────
Array segment: [[1,3], [-2,2], [2,2]]
                 10      8       8
Pivot: [2,2] at index 2, distance = 8
Partition process:
i = 0
j = 0: [1,3], dist = 10
10 < 8? NO → don't swap
i stays 0
j = 1: [-2,2], dist = 8
8 < 8? NO → don't swap
i stays 0
Place pivot at position i=0:
Swap [1,3] and [2,2]
Array: [[2,2], [-2,2], [1,3], [5,1]]
          8      8       10     26
Pivot position: p = 0
Check: p vs k
0 < 2 → Need more!
Recurse on [1, 2]
───────────────────────────────────────────────────────────────
PARTITION 3: Range [1, 2], pivot = last element
───────────────────────────────────────────────────────────────
Array segment: [[-2,2], [1,3]]
                  8       10
Pivot: [1,3] at index 2, distance = 10
Partition process:
i = 1
j = 1: [-2,2], dist = 8
8 < 10? YES → swap with position i=1
Array: [[2,2], [-2,2], [1,3], [5,1]]
i becomes 2
Place pivot at position i=2:
Swap [1,3] with [1,3] (no change)
Pivot position: p = 2
Check: p vs k
2 = 2 → Perfect! DONE! ✓
───────────────────────────────────────────────────────────────
FINAL STATE
───────────────────────────────────────────────────────────────
Array: [[2,2], [-2,2], [1,3], [5,1]]
        └───k=2────┘   └──rest──┘
First k=2 positions: [[2,2], [-2,2]]
Both have distance 8 ✓
Verification:
Distances: [8, 8, 10, 26]
First 2 are the smallest! ✓
───────────────────────────────────────────────────────────────
COMPLEXITY ANALYSIS
───────────────────────────────────────────────────────────────
Partition costs:
Partition 1: Process 4 elements → O(4)
Partition 2: Process 3 elements → O(3)
Partition 3: Process 2 elements → O(2)
Total: O(4 + 3 + 2) = O(9) ≈ O(n)
General formula:
n + n/2 + n/4 + ... ≈ 2n = O(n) ✓
Compare with other approaches:
Sort: O(4 log 4) ≈ O(8)
Heap: O(4 log 2) ≈ O(4 × 1) = O(4)
Quickselect: O(9)
For small n, all similar.
For large n with large k:
n = 10,000, k = 5,000
Heap: 10,000 × log(5,000) ≈ 120,000 ops
Quickselect: ≈ 20,000 ops
6x faster! 🚀
Implementation
import java.util.*;
/**
 * Quickselect (Optimal for large k)
 * Time: O(n) average, O(n²) worst
 * Space: O(1)
 */
class Solution {
public int[][] kClosest(int[][] points, int k) {
// Partition so first k positions have k closest
quickselect(points, 0, points.length - 1, k);
// Return first k points
return Arrays.copyOfRange(points, 0, k);
}
/**
* Partition array so first k positions have k smallest distances
*
* @param points - array of points
* @param left - left boundary of range to partition
* @param right - right boundary of range to partition
* @param k - number of closest points needed
*/
private void quickselect(int[][] points, int left, int right, int k) {
// Base case: range has 0 or 1 element
if (left >= right) return;
// Partition around pivot
int pivotIndex = partition(points, left, right);
// Check pivot position
if (pivotIndex == k) {
// Perfect! First k positions have k closest
return;
} else if (pivotIndex < k) {
// Need more points, recurse right
quickselect(points, pivotIndex + 1, right, k);
} else {
// Have too many, recurse left
quickselect(points, left, pivotIndex - 1, k);
}
}
/**
* Partition points around pivot
* Points closer than pivot go left, farther go right
*
* @return final position of pivot
*/
private int partition(int[][] points, int left, int right) {
// Choose last element as pivot
int[] pivot = points[right];
int pivotDist = distance(pivot);
// i tracks next position for smaller elements
int i = left;
// Scan through range
for (int j = left; j < right; j++) {
// If current point closer than pivot
if (distance(points[j]) < pivotDist) {
// Swap to left section
swap(points, i, j);
i++;
}
}
// Place pivot in final position
swap(points, i, right);
// Return pivot's final position
return i;
}
/**
* Calculate squared distance to origin
* No sqrt needed for comparison
*/
private int distance(int[] point) {
return point[0] * point[0] + point[1] * point[1];
}
/**
* Swap two points in array
*/
private void swap(int[][] points, int i, int j) {
int[] temp = points[i];
points[i] = points[j];
points[j] = temp;
}
public static void main(String[] args) {
Solution sol = new Solution();
int[][] result = sol.kClosest(new int[][]{{3,3},{5,-1},{-2,4}}, 2);
System.out.println(Arrays.deepToString(result));
}
}
⏰ Time: O(n) average, O(n²) worst
💾 Space: O(1) in-place
Why This is Optimal (for large k):
✅ O(n) average vs O(n log k) heap
✅ O(1) space vs O(k) space
✅ In-place partitioning
✅ No extra data structures
When to use:
Small k (< 100): Max-heap simpler, similar speed
Large k (> n/2): Quickselect faster
Space critical: Quickselect (O(1))
Simplicity preferred: Max-heap easier
Example advantage (n=10,000, k=5,000):
Heap: 10,000 × log(5,000) ≈ 120,000 ops
Quickselect: ≈ 20,000 ops
6x faster! 🚀
📊 Approach Comparison - The Complete Growth Journey
┌───────────────┬─────────────┬──────────┬──────────────────────┐
│ Approach      │ Time        │ Space    │ Key Insight          │
├───────────────┼─────────────┼──────────┼──────────────────────┤
│ Sort all      │ O(n log n)  │ O(1)     │ Simple but wasteful  │
│ Max-heap k    │ O(n log k)  │ O(k)     │ Only track k         │
│ Quickselect   │ O(n) avg    │ O(1)     │ Partition-based      │
└───────────────┴─────────────┴──────────┴──────────────────────┘
THE COMPLETE LEARNING PROGRESSION:
Level 1: Brute Force
Thought: "Sort all by distance, take k"
Works? YES ✅
Optimal? NO ❌
Time: O(n log n)
Problem: Sorts all when only need k
Learning: Understand problem, get working solution
Level 2: Pattern Recognition
Observation: "Only need k closest!"
Pattern: "Like Top K problems!"
Connection: Problems 275, 278, 281
Key: "Closest = smallest distance"
Insight: Use heap-based approach!
Level 3: Max-Heap Solution
Implementation: Max-heap of size k
WHY max?: Top = farthest of k closest = boundary
Result: O(n log k) ✅
Better: Much better when k << n!
Trade-off: O(k) space vs O(1)
Level 4: Further Optimization
Observation: "Can use quickselect!"
Pattern: "Like Problem 275!"
Idea: "Partition for O(n) average!"
When better?: k large (k ≈ n/2)
Level 5: Optimal Solution
Implementation: Quickselect
Result: O(n) average ✅
Perfect: Can't do better asymptotically! ✨
Growth: Learned when each approach is best!
CONCRETE EXAMPLE (n=10,000, k=100):
Sort all:
10,000 × log(10,000) ≈ 130,000 ops
Baseline
Max-heap:
10,000 × log(100) ≈ 67,000 ops
2x faster than sort! ✨
Quickselect:
10,000 + 5,000 + 2,500 + ... ≈ 20,000 ops
6.5x faster than sort!
3x faster than heap! 🚀
When k=5,000 (large k):
Heap: 10,000 × log(5,000) ≈ 120,000 ops (close to sort!)
Quickselect: Still ≈ 20,000 ops
Quickselect 6x better! ✨
💡 Key Learnings - Your Complete Growth
WHAT YOU LEARNED:
1. DISTANCE FORMULA CLARIFICATION:
✅ General: √((x₁-x₂)² + (y₁-y₂)²)
✅ To origin: √(x² + y²) [since origin is (0,0)]
✅ In code: x*x + y*y [since (x-0)=x, (y-0)=y]
2. DISTANCE OPTIMIZATION:
✅ Don't need sqrt for comparison
✅ √a < √b ⇔ a < b (monotonic property)
✅ Compare x² + y² directly
✅ Faster and more accurate!
3. WHY MAX-HEAP FOR K SMALLEST:
✅ Closest = smallest distance
✅ Track k smallest distances
✅ Top = largest of k smallest = boundary
✅ Easy eviction: remove top when find closer
✅ O(log k) per operation
4. WHY NOT MIN-HEAP:
❌ Top = smallest, but need to evict largest!
❌ Can't easily find what to remove
❌ Wrong data structure for this problem
5. PATTERN: "Opposite Heap for Boundary"
✅ k largest → min-heap (boundary on small side)
✅ k smallest → max-heap (boundary on large side)
✅ Top always = boundary for easy comparison
✅ Same pattern as Problems 275, 278, 281!
6. QUICKSELECT UNDERSTANDING:
✅ Partial ordering sufficient
✅ Partition-based selection
✅ O(n) average time
✅ Better for large k
7. WHEN TO USE EACH APPROACH:
✅ Small k (<100): Max-heap simpler, similar speed
✅ Large k (>n/2): Quickselect better
✅ Need streaming: Max-heap (can't quickselect a stream)
✅ One-time query: Either (based on k size)
✅ Simple code: Sort acceptable for interviews
8. COMPARATOR SAFETY:
✅ Use Integer.compare(distB, distA) out of habit
✅ Note: under these constraints dist² ≤ 2·10⁸, so distB - distA
   actually fits in an int here - but the subtraction idiom overflows
   as soon as values can be negative or exceed roughly 10⁹
✅ Integer.compare is always safe; subtraction is safe only after
   checking the value ranges
⚠️ Common Mistakes
Mistake 1: Using min-heap for k closest
// β WRONG - Min-heap for k smallest distances
PriorityQueue<int[]> minHeap = new PriorityQueue<>((a,b) ->
Integer.compare(distance(a), distance(b)));
// Why wrong?
// Min-heap gives smallest, but we need to evict LARGEST
// Can't easily find what to remove!
// β CORRECT - Max-heap for k smallest distances
PriorityQueue<int[]> maxHeap = new PriorityQueue<>((a,b) -> {
int distA = a[0]*a[0] + a[1]*a[1];
int distB = b[0]*b[0] + b[1]*b[1];
return Integer.compare(distB, distA); // Max-heap
});
Mistake 2: Computing sqrt unnecessarily
// β WRONG - Expensive and unnecessary
double dist = Math.sqrt(x*x + y*y);
// β CORRECT - Compare squared distances
int distSq = x*x + y*y;
Mistake 3: Integer overflow in comparator
// β DANGEROUS - Can overflow!
return distB - distA;
// Why dangerous?
// distA, distB each up to 200,000,000 (fits in int)
// But distB - distA can overflow!
// Example: distB = 200M, distA = -200M β overflow
// β SAFE - Uses Integer.compare
return Integer.compare(distB, distA);
Mistake 4: Wrong understanding of the distance formula
// ❌ WRONG - Treating it as a general two-point distance
int dist = (x2-x1)*(x2-x1) + (y2-y1)*(y2-y1);
// ✅ CORRECT - One point is the origin (0, 0)!
int dist = x*x + y*y; // This IS (x-0)² + (y-0)²
Mistake 5: Not maintaining the heap size
// ❌ WRONG - Heap grows unbounded
maxHeap.offer(point);
// ✅ CORRECT - Maintain size ≤ k
maxHeap.offer(point);
if (maxHeap.size() > k) {
maxHeap.poll();
}
Mistake 6: Wrong quickselect termination
// ❌ WRONG - Misses pivotIndex == k and recurses into the wrong half
if (pivotIndex < k) {
quickselect(left, pivotIndex - 1, k);
}
// ✅ CORRECT - Handle all three cases
if (pivotIndex == k) {
return; // Done!
} else if (pivotIndex < k) {
quickselect(pivotIndex + 1, right, k);
} else {
quickselect(left, pivotIndex - 1, k);
}
🎯 Pattern Recognition
Problem Type: K Smallest/Closest Selection
Core Pattern: Max-Heap for K Smallest
When to Apply:
✓ Find k smallest/closest elements
✓ Distance-based selection
✓ Don't need sorted order
✓ Can optimize the distance calculation
Recognition Keywords:
- "k closest"
- "k smallest"
- "distance to origin"
- "nearest points"
- "k points with smallest"
Similar Problems:
- Kth Largest Element (LC 215) - Min-heap for k largest
- Top K Frequent Elements (LC 347) - Min-heap of the k most frequent
- Kth Largest in Stream (LC 703) - Min-heap for k largest
- Find K Pairs with Smallest Sums (LC 373)
- Kth Smallest in Sorted Matrix (LC 378)
Key Components:
┌───────────────────────────────────────┐
│ Distance: x² + y² (no sqrt needed)    │
│ Max-Heap: For k smallest distances    │
│ Top: Largest of k smallest = boundary │
│ Quickselect: O(n) average alternative │
│ Pattern: "Opposite heap for boundary" │
└───────────────────────────────────────┘
Universal Pattern Across "Top K" Problems:
Track k LARGEST → use MIN-heap
Track k SMALLEST → use MAX-heap
Why? Keep the boundary at the top for easy comparison!
📝 Quick Revision Notes
🎯 Core Concept:
Find the k closest points to the origin. Distance to origin: √(x² + y²), not √((x₁-x₂)² + (y₁-y₂)²), because the origin is (0,0). Skip sqrt: compare x² + y² directly. Sort: O(n log n), wasteful. Max-heap of k: tracks the k smallest distances; top = farthest of the k closest = boundary. NOT a min-heap: it can't decide what to evict! Quickselect: O(n) average, partition-based, best for large k.
⚡ Quick Implementation:
import java.util.Arrays;
import java.util.PriorityQueue;
public class Solution {
public int[][] kClosest(int[][] a, int k) {
// return naive(a, k);
// return maxHeap(a, k);
quickselect(a, 0, a.length - 1, k);
// Return first k points
return Arrays.copyOfRange(a, 0, k);
}
private void quickselect(int[][] points, int left, int right, int k) {
// Base case: range has 0 or 1 element
if (left >= right)
return;
// Partition around pivot
int pivotIndex = partition(points, left, right);
// Check pivot position
if (pivotIndex == k) {
// Perfect! First k positions have k closest
return;
} else if (pivotIndex < k) {
// Need more points, recurse right
quickselect(points, pivotIndex + 1, right, k);
} else {
// Have too many, recurse left
quickselect(points, left, pivotIndex - 1, k);
}
}
private int partition(int[][] points, int left, int right) {
// Choose last element as pivot
int[] pivot = points[right];
int pivotDist = distance(pivot);
// i tracks next position for smaller elements
int i = left;
// Scan through range
for (int j = left; j < right; j++) {
// If current point closer than pivot
if (distance(points[j]) < pivotDist) {
// Swap to left section
swap(points, i, j);
i++;
}
}
// Place pivot in final position
swap(points, i, right);
// Return pivot's final position
return i;
}
private int distance(int[] point) {
return point[0] * point[0] + point[1] * point[1];
}
private void swap(int[][] points, int i, int j) {
int[] temp = points[i];
points[i] = points[j];
points[j] = temp;
}
private int[][] maxHeap(int[][] a, int k) {
// step 1: create a max-heap ordered by squared distance from origin,
// so polling always removes the farthest point currently held
// (its size is capped at k in step 2; PriorityQueue itself is unbounded).
PriorityQueue<int[]> pq = new PriorityQueue<>((p1, p2) -> {
int distP1FromOrigin = p1[1] * p1[1] + p1[0] * p1[0];
int distP2FromOrigin = p2[1] * p2[1] + p2[0] * p2[0];
return Integer.compare(distP2FromOrigin, distP1FromOrigin);
});
// step 2: process data
for (int[] curr : a) {
pq.offer(curr);
if (pq.size() > k) {
pq.poll();
}
}
// step 3: drain the heap's k points (the k closest, in no particular order)
int[][] res = new int[k][2];
for (int i = 0; i < k; i++) {
int[] curr = pq.poll();
res[i][0] = curr[0];
res[i][1] = curr[1];
}
return res;
}
private int[][] naive(int[][] a, int k) {
// step 1: sort points ascending by squared distance from origin.
// dist from (x,y) to (0,0) = sqrt(x^2 + y^2); the full formula
// sqrt((x-0)^2 + (y-0)^2) simplifies because the origin is (0,0).
// We use Integer.compare rather than subtraction, as advised above.
Arrays.sort(a, (p1, p2) -> {
int distP1FromOrigin = p1[1] * p1[1] + p1[0] * p1[0];
int distP2FromOrigin = p2[1] * p2[1] + p2[0] * p2[0];
return Integer.compare(distP1FromOrigin, distP2FromOrigin);
});
// step 2: take first k results
int[][] res = new int[k][2];
for (int i = 0; i < k; i++) {
res[i][0] = a[i][0];
res[i][1] = a[i][1];
}
return res;
}
public static void main(String[] args) {
Solution s = new Solution();
System.out.println(Arrays.deepToString(s.kClosest(new int[][] { { 1, 3 }, { -2, 2 } }, 1)));
System.out.println(Arrays.deepToString(s.kClosest(new int[][] { { 3, 3 }, { 5, -1 }, { -2, 4 } }, 2)));
}
}
🔑 Key Insights:
- Distance Formula: to the origin (0,0) → √(x²+y²), not √((x₁-x₂)²+(y₁-y₂)²)
- Why It Simplifies: (x-0)=x and (y-0)=y, so the formula becomes √(x²+y²)
- Why Max-Heap?: closest = smallest; track the k smallest; top = largest of the k smallest = boundary ✓
- Why Not Min-Heap?: top = smallest, but we need to evict the largest, which it can't find easily ❌
- Integer.compare: overflow-safe habit, unlike distB-distA
- No Sqrt: compare squared distances (sqrt is monotonic)
- Quickselect: partial ordering, O(n) average, best for large k
- Pattern: k largest→min-heap, k smallest→max-heap (the opposite heap guards the boundary)
- Time: Sort O(n log n), Heap O(n log k), Quickselect O(n) avg
- Space: Sort O(1), Heap O(k), Quickselect O(1) ✓
💪 Memory Aid:
"To origin? Just x²+y²!"
"Closest = smallest! Max-heap for k smallest!"
"Top = boundary = farthest of k closest!" ✨
🧪 Edge Cases
Case 1: k = n (all points)
Input: points = [[1,0],[0,1]], k = 2
Output: [[1,0],[0,1]] (any order)
All points returned
Case 2: k = 1 (single closest)
Input: points = [[1,3],[-2,2]], k = 1
Output: [[-2,2]]
Single closest point
Case 3: Equal distances
Input: points = [[1,0],[0,1],[-1,0],[0,-1]], k = 2
All at distance 1 from origin
Any 2 points valid
Case 4: Origin included
Input: points = [[0,0],[1,1]], k = 1
Output: [[0,0]]
Origin has distance 0 (closest possible)
Case 5: Large coordinates
Input: points = [[10000,10000],[-10000,-10000]], k = 1
Both have distance² = 200,000,000 (still fits in an int)
Either answer is valid (equal distances)
Integer.compare keeps the comparison safe even at these magnitudes
All handled correctly! ✅
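The cases above can be checked mechanically; a minimal standalone harness (class and helper names are my own - it re-implements the max-heap approach compactly rather than reusing the Solution class):

```java
import java.util.Arrays;
import java.util.PriorityQueue;

public class EdgeCaseCheck {
    // Compact max-heap k-closest, mirroring the approach above.
    static int[][] kClosest(int[][] points, int k) {
        PriorityQueue<int[]> pq = new PriorityQueue<>((p, q) ->
            Integer.compare(q[0]*q[0] + q[1]*q[1], p[0]*p[0] + p[1]*p[1]));
        for (int[] p : points) {
            pq.offer(p);
            if (pq.size() > k) pq.poll(); // evict the farthest kept point
        }
        return pq.toArray(new int[0][]);
    }

    // Sorted squared distances, for order-independent comparison of answers.
    static int[] distances(int[][] pts) {
        int[] d = new int[pts.length];
        for (int i = 0; i < pts.length; i++) d[i] = pts[i][0]*pts[i][0] + pts[i][1]*pts[i][1];
        Arrays.sort(d);
        return d;
    }

    public static void main(String[] args) { // run with java -ea to enable assertions
        // Case 1: k = n returns every point
        assert kClosest(new int[][]{{1,0},{0,1}}, 2).length == 2;
        // Case 3: four tied points at distance 1 - any two are acceptable
        assert Arrays.equals(distances(kClosest(new int[][]{{1,0},{0,1},{-1,0},{0,-1}}, 2)),
                             new int[]{1, 1});
        // Case 4: the origin itself (distance 0) always wins
        assert Arrays.equals(distances(kClosest(new int[][]{{0,0},{1,1}}, 1)), new int[]{0});
        // Case 5: 10000² + 10000² = 200,000,000 still fits in int
        assert Arrays.equals(distances(kClosest(new int[][]{{10000,10000},{-10000,-10000}}, 1)),
                             new int[]{200_000_000});
        System.out.println("all edge cases pass");
    }
}
```

Comparing sorted distance arrays rather than the points themselves is what makes the tie cases checkable, since any valid answer is accepted.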
📊 Complexity Analysis
Approach 1: Sort All
Time: O(n log n)
Calculate distances: O(n)
Sort: O(n log n)
Extract k: O(k)
Total: O(n log n)
Space: O(1) auxiliary if the sort is in-place
(Java's Arrays.sort on objects is a mergesort variant and can use up to O(n) internally)
Approach 2: Max-Heap of K
Time: O(n log k)
Process n points
Each: offer O(log k) + maybe poll O(log k)
Total: O(n log k)
Space: O(k)
Heap contains at most k elements
When k << n: much better!
n = 10,000, k = 10
Sort: ~130,000 comparisons (n log n)
Heap: ~33,000 comparisons (n log k)
~4x faster! ✨
Approach 3: Quickselect
Time: O(n) average, O(n²) worst
Average: n + n/2 + n/4 + ... ≈ 2n
Worst: adversarial input where every pivot is bad (e.g. sorted data with a last-element pivot)
Space: O(1)
In-place partitioning
No extra data structures
When k is large: best choice!
n = 10,000, k = 5,000
Heap: ~120,000 comparisons (n log k)
Quickselect: ~20,000 comparisons (≈ 2n)
~6x faster! 🚀
Optimal on average - you can't beat O(n), since every point must be examined at least once!
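The back-of-envelope counts quoted above can be reproduced with a few lines; these are rough estimates only, since big-O constants are ignored:

```java
public class OpCountEstimates {
    static double log2(double x) { return Math.log(x) / Math.log(2); }

    public static void main(String[] args) {
        int n = 10_000;
        // Small k: sorting everything vs a size-10 max-heap
        System.out.printf("sort  ~ %.0f%n", n * log2(n));    // ~132877 (the ~130,000 above)
        System.out.printf("heap  ~ %.0f%n", n * log2(10));   // ~33219  (the ~33,000 above)
        // Large k: a size-5000 max-heap vs quickselect's ~2n partitioning work
        System.out.printf("heap  ~ %.0f%n", n * log2(5000)); // ~122877 (the ~120,000 above)
        System.out.printf("quick ~ %.0f%n", 2.0 * n);        // 20000
    }
}
```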
Happy practicing! 🎯
Note: This problem perfectly demonstrates the "max-heap for k smallest" pattern! You learn: (1) the distance to the origin simplifies from √((x₁-x₂)²+(y₁-y₂)²) to √(x²+y²) because the origin is (0,0), (2) WHY a max-heap and not a min-heap (top = boundary = largest of the k smallest, for easy eviction), (3) WHY Integer.compare rather than subtraction (an overflow-safe habit), (4) quickselect for O(n) average time when k is large, (5) pattern recognition across all "Top K" problems (the opposite heap tracks the boundary). The complete reasoning - distance formula clarification, min vs max heap comparison, quickselect partitioning logic, overflow discussion - builds production-ready problem-solving skills! True mastery! 💪✨🚀