279. Find K Pairs with Smallest Sums
π LeetCode Problem: 373. Find K Pairs with Smallest Sums
π Difficulty: Medium
π·οΈ Topics: Array, Heap (Priority Queue)
Problem Statement
You are given two integer arrays nums1 and nums2 sorted in non-decreasing order and an integer k.
Define a pair (u, v) which consists of one element from the first array and one element from the second array.
Return the k pairs (u1, v1), (u2, v2), ..., (uk, vk) with the smallest sums.
Example 1:
Input: nums1 = [1,7,11], nums2 = [2,4,6], k = 3
Output: [[1,2],[1,4],[1,6]]
Explanation: The first 3 pairs are returned from the sequence:
[1,2],[1,4],[1,6],[7,2],[7,4],[11,2],[7,6],[11,4],[11,6]
Example 2:
Input: nums1 = [1,1,2], nums2 = [1,2,3], k = 3
Output: [[1,1],[1,1],[1,2]]
Explanation: The first 3 pairs are returned from the sequence:
[1,1],[1,1],[2,1],[1,2],[1,2],[2,2],[1,3],[1,3],[2,3]
Example 3:
Input: nums1 = [1,2], nums2 = [3], k = 3
Output: [[1,3],[2,3]]
Explanation: All possible pairs are returned from the sequence: [1,3],[2,3]
Constraints:
- 1 <= nums1.length, nums2.length <= 10^5
- -10^9 <= nums1[i], nums2[i] <= 10^9
- nums1 and nums2 both are sorted in non-decreasing order
- 1 <= k <= 10^4
- k <= nums1.length * nums2.length
π Understanding the Problem
What are we finding?
nums1 = [1, 7, 11]
nums2 = [2, 4, 6]
All possible pairs (u, v):
(1,2) sum=3 β smallest
(1,4) sum=5
(1,6) sum=7
(7,2) sum=9
(7,4) sum=11
(7,6) sum=13
(11,2) sum=13
(11,4) sum=15
(11,6) sum=17
Total pairs: 3 Γ 3 = 9
k = 3 β Return 3 pairs with smallest sums
Answer: [[1,2], [1,4], [1,6]]
Key Observations:
1. Arrays are SORTED (non-decreasing)
β Smallest elements at start
β Largest elements at end
2. Total possible pairs: m Γ n
β m = nums1.length, n = nums2.length
β Can be HUGE (10^5 Γ 10^5 = 10^10!)
3. k is much smaller than m Γ n
β k β€ 10^4
β Don't need ALL pairs, just k smallest
4. Smallest pair: (nums1[0], nums2[0])
β Both arrays sorted
β First elements give minimum sum
π The Natural Thinking Process
When you first see this problem:
Question: Find k pairs with smallest sums
First Thought: "Generate all pairs, sort by sum, take k"
Step 1: Create all mΓn pairs
Step 2: Sort by sum
Step 3: Take first k pairs
Is this optimal? NO (can generate 10^10 pairs!), but it WORKS!
Let's analyze it first.
The Evolution:
BRUTE FORCE THINKING:
"Generate ALL pairs β Sort β Take k"
Time: O(mΓn log(mΓn)) for sorting
Space: O(mΓn) for storing all pairs
Problem: When m,n = 10^5, this is 10^10 pairs! π±
β¬
REALIZATION 1:
"I don't need ALL pairs, just k smallest!"
"Use min-heap to get k smallest!"
β¬
OPTIMIZATION 1: Min-Heap with All Pairs
"Add all mΓn pairs to heap"
"Poll k times"
Time: O(mΓn log(mΓn)) - still bad!
Same as sorting!
β¬
REALIZATION 2:
"Arrays are SORTED!"
"Smallest pairs come from small indices!"
"What if I generate pairs INCREMENTALLY?"
β¬
OPTIMIZATION 2: Min-Heap with Smart Generation
"Start with (nums1[0], nums2[0])"
"Generate next candidates smartly"
"Use heap to get smallest, generate more"
Time: O(k log k) - MUCH better! β¨
π― Approach 1: Brute Force - Generate All Pairs β
The First Natural Solution
The Thought Process:
Step 1: Read problem
"Find k pairs with smallest sums"
Step 2: How to find smallest pairs?
"Generate all possible pairs"
"Sort by sum"
"Take first k"
This generates mΓn pairs, but let's code it to understand!
Visual Tracking - Complete Example
Input: nums1 = [1, 7, 11], nums2 = [2, 4, 6], k = 3
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
STEP 1: Generate All Pairs
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
For each element in nums1:
For each element in nums2:
Create pair (nums1[i], nums2[j])
i=0, j=0: pair = (1, 2), sum = 3
i=0, j=1: pair = (1, 4), sum = 5
i=0, j=2: pair = (1, 6), sum = 7
i=1, j=0: pair = (7, 2), sum = 9
i=1, j=1: pair = (7, 4), sum = 11
i=1, j=2: pair = (7, 6), sum = 13
i=2, j=0: pair = (11, 2), sum = 13
i=2, j=1: pair = (11, 4), sum = 15
i=2, j=2: pair = (11, 6), sum = 17
All pairs:
[(1,2,3), (1,4,5), (1,6,7), (7,2,9), (7,4,11),
(7,6,13), (11,2,13), (11,4,15), (11,6,17)]
Total: 9 pairs
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
STEP 2: Sort by Sum
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
Already sorted in this example!
(In general, need to sort)
Sorted: [(1,2,3), (1,4,5), (1,6,7), (7,2,9), (7,4,11), ...]
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
STEP 3: Take First K
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
k = 3
Take first 3:
(1, 2)
(1, 4)
(1, 6)
Result: [[1,2], [1,4], [1,6]] β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
RESULT
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
Answer: [[1,2], [1,4], [1,6]]
Correct, but inefficient!
Implementation
import java.util.*;
/**
* Brute Force: Generate all pairs
* Time: O(mΓn log(mΓn))
* Space: O(mΓn)
*/
class Solution {
public List<List<Integer>> kSmallestPairs(int[] nums1, int[] nums2, int k) {
List<List<Integer>> result = new ArrayList<>();
// Step 1: Generate all pairs with sums
List<int[]> allPairs = new ArrayList<>();
for (int i = 0; i < nums1.length; i++) {
for (int j = 0; j < nums2.length; j++) {
allPairs.add(new int[]{nums1[i], nums2[j], nums1[i] + nums2[j]});
}
}
// Step 2: Sort by sum
allPairs.sort((a, b) -> a[2] - b[2]);
// Step 3: Take first k
for (int i = 0; i < Math.min(k, allPairs.size()); i++) {
result.add(Arrays.asList(allPairs.get(i)[0], allPairs.get(i)[1]));
}
return result;
}
}
β° Time: O(mΓn log(mΓn)) - Generate all + sort
πΎ Space: O(mΓn) - Store all pairs
Why This is Impractical:
Problem: Generates ALL pairs!
Example:
m = 10^5, n = 10^5
Total pairs: 10^10 (10 billion!)
Memory: 10^10 Γ 12 bytes β 120 GB! π±
Time: O(10^10 log 10^10) - IMPOSSIBLE!
But k β€ 10^4:
We only need 10,000 pairs
Generating 10 billion is WASTEFUL!
We must do better!
π‘ The First AHA Moment
PROBLEM ANALYSIS:
"I'm generating ALL mΓn pairs"
"But I only need k pairs!"
"And k can be MUCH smaller than mΓn"
Example: k=100, m=10000, n=10000
Generating 100,000,000 pairs
When I only need 100!
INSIGHT:
"Don't generate all pairs upfront!"
"Generate pairs on-demand using min-heap!"
But still, if I add all pairs to heap...
Same problem! O(mΓn log(mΓn))
Need something smarter...
π‘ The Second AHA Moment - The Critical Insight
OBSERVATION:
Arrays are SORTED!
nums1 = [1, 7, 11] (increasing)
nums2 = [2, 4, 6] (increasing)
All pairs with nums1[0]=1:
(1,2) sum=3 β smallest overall
(1,4) sum=5
(1,6) sum=7
All pairs with nums1[1]=7:
(7,2) sum=9
(7,4) sum=11
(7,6) sum=13
All pairs with nums1[2]=11:
(11,2) sum=13
(11,4) sum=15
(11,6) sum=17
PATTERN:
(1,2) < (1,4) < (1,6) β (same first, second increases)
(7,2) < (7,4) < (7,6) β
(1,2) < (7,2) < (11,2) β (same second, first increases)
KEY INSIGHT:
After picking (1,2), next smallest candidates are:
(1,4) β next in same row (increment j)
(7,2) β next row (increment i)
We don't need (1,6) yet!
We don't need (7,4) yet!
Generate pairs INCREMENTALLY as needed! β¨
π― Approach 2: Min-Heap with Smart Generation (Optimal) βββ
The Optimal Solution
The Evolution of Thought:
Brute Force: Generate all pairs
β¬
Question: "Can I avoid generating all pairs?"
β¬
Observation: "Arrays are sorted!"
β¬
Insight: "Generate pairs incrementally!"
β¬
Optimal: "Min-heap + smart candidate generation"
Understanding Smart Generation - Complete Explanation
How do we generate pairs incrementally?
THE BRILLIANT STRATEGY:
Think of pairs as a 2D matrix organized by indices:
j=0 j=1 j=2
(2) (4) (6) β nums2
i=0 (1,2) (1,4) (1,6) β i=0, nums1[0]=1
3 5 7
i=1 (7,2) (7,4) (7,6) β i=1, nums1[1]=7
9 11 13
i=2 (11,2) (11,4) (11,6) β i=2, nums1[2]=11
13 15 17
Observation:
- Each row increases left to right (j increases)
- Each column increases top to bottom (i increases)
- Smallest is at (0,0) = (1,2)
Strategy:
1. Start with (0,0)
2. After picking (i,j), add candidates:
- (i, j+1) β move right in same row
- (i+1, j) β move down in same column
3. Use min-heap to always pick smallest
4. Avoid duplicates with visited set
WHY This Generation Strategy Works - Critical Reasoning
The Core Question:
After picking pair (i, j), why do we only add:
- (i, j+1) β right neighbor
- (i+1, j) β down neighbor
Why not other candidates?
Answer Through Reasoning:
PRINCIPLE: Sorted Arrays Property
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
Since both arrays are sorted:
nums1[i] β€ nums1[i+1] β€ nums1[i+2] ...
nums2[j] β€ nums2[j+1] β€ nums2[j+2] ...
For any pair (i, j):
sum = nums1[i] + nums2[j]
Neighboring sums:
(i, j+1): nums1[i] + nums2[j+1] β₯ nums1[i] + nums2[j] β
(i+1, j): nums1[i+1] + nums2[j] β₯ nums1[i] + nums2[j] β
Both neighbors have sums β₯ current!
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
WHY Only Right and Down?
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
Consider the 2D matrix of sums:
j=0 j=1 j=2
i=0 3 5 7
i=1 9 11 13
i=2 13 15 17
If we're at (0,0) with sum=3:
- Right (0,1): sum=5
- Down (1,0): sum=9
- Diagonal (1,1): sum=11
But diagonal (1,1) = 11 > 9 = (1,0)
And to reach (1,1), we must go through either:
- (0,0) β (0,1) β (1,1)
- (0,0) β (1,0) β (1,1)
When we process (0,1) or (1,0), we'll add (1,1) then!
By only adding right and down neighbors:
- We ensure all reachable pairs are eventually added
- We add them in order (smaller sums first)
- We avoid adding too many candidates at once
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
Duplicate Prevention - WHY We Need Visited Set
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
Problem: Same pair can be reached from multiple paths!
Example:
(1,1) can be reached from:
- (0,1) moving down
- (1,0) moving right
Without visited set:
1. Process (0,0), add (0,1) and (1,0)
2. Process (0,1), add (0,2) and (1,1) β first time
3. Process (1,0), add (1,1) and (2,0) β duplicate (1,1)!
Heap would have (1,1) twice! β
With visited set:
When adding (i,j) to heap:
if (i,j) not in visited:
add to heap
mark as visited
This prevents duplicates! β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
Initial Strategy - Start with First Row
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
WHY start with first row (i=0, j=0 to j=min(k-1, n-1))?
Option 1: Start with only (0,0)
- Must process many candidates to reach (0,k-1)
- Slower for large k
Option 2: Start with first k positions in first row
- [(0,0), (0,1), (0,2), ..., (0,k-1)]
- These are smallest k candidates from first element of nums1
- Gives us k starting points
- Faster to reach all k pairs!
Why first ROW not first COLUMN?
Convention - either works
But first row means:
- Fix nums1[0] (first element)
- Vary nums2[0..k-1]
This spreads candidates efficiently!
Example: k=3
Start: [(0,0), (0,1), (0,2)]
Heap has 3 candidates with smallest from nums1[0]
As we poll, we add down neighbors
Efficiently explores the space!
Visual Tracking - Complete Example
Input: nums1 = [1, 7, 11], nums2 = [2, 4, 6], k = 3
Index pairs visualization:
j=0(2) j=1(4) j=2(6)
i=0(1) 3 5 7
i=1(7) 9 11 13
i=2(11) 13 15 17
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
INITIALIZATION
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
Start with first row (i=0), up to k elements:
k = 3, n = 3
Add: (0,0), (0,1), (0,2)
Min-Heap: [(0,0,3), (0,1,5), (0,2,7)]
(i,j,sum)
β
smallest
Visited: {(0,0), (0,1), (0,2)}
Result: []
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
ITERATION 1: Poll Smallest
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
Poll: (0, 0) with sum=3
Pair: (nums1[0], nums2[0]) = (1, 2)
Add to result: [[1, 2]]
Generate candidates from (0, 0):
Right: (0, 0+1) = (0, 1) β Already visited! Skip
Down: (0+1, 0) = (1, 0) β Not visited!
Add (1, 0) with sum = nums1[1] + nums2[0] = 7+2 = 9
Heap: [(0,1,5), (0,2,7), (1,0,9)]
β
smallest
Visited: {(0,0), (0,1), (0,2), (1,0)}
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
ITERATION 2: Poll Smallest
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
Poll: (0, 1) with sum=5
Pair: (nums1[0], nums2[1]) = (1, 4)
Add to result: [[1,2], [1,4]]
Generate candidates from (0, 1):
Right: (0, 1+1) = (0, 2) β Already visited! Skip
Down: (0+1, 1) = (1, 1) β Not visited!
Add (1, 1) with sum = 7+4 = 11
Heap: [(0,2,7), (1,0,9), (1,1,11)]
β
smallest
Visited: {(0,0), (0,1), (0,2), (1,0), (1,1)}
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
ITERATION 3: Poll Smallest
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
Poll: (0, 2) with sum=7
Pair: (nums1[0], nums2[2]) = (1, 6)
Add to result: [[1,2], [1,4], [1,6]]
Result size = 3 = k β STOP! β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
RESULT
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
Result: [[1,2], [1,4], [1,6]]
Sums: [3, 5, 7] β All are smallest! β
Key observations:
β Only generated 6 candidates (not all 9 pairs)
β Used min-heap to always get smallest
β Generated incrementally as needed
β Visited set prevented duplicates
β Stopped as soon as k pairs found
Another Example - Showing Duplicate Prevention
Input: nums1 = [1, 2], nums2 = [3, 4], k = 4
Matrix:
j=0(3) j=1(4)
i=0(1) 4 5
i=1(2) 5 6
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
INITIALIZATION
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
Start: (0,0), (0,1)
Heap: [(0,0,4), (0,1,5)]
Visited: {(0,0), (0,1)}
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
ITERATION 1
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
Poll: (0,0) sum=4
Result: [[1,3]]
Add candidates from (0,0):
Right: (0,1) β visited, skip
Down: (1,0) β not visited, add!
Heap: [(0,1,5), (1,0,5)]
Visited: {(0,0), (0,1), (1,0)}
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
ITERATION 2
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
Poll: (0,1) sum=5 (or (1,0), both have sum=5)
Let's say we poll (0,1)
Result: [[1,3], [1,4]]
Add candidates from (0,1):
Right: (0,2) β out of bounds, skip
Down: (1,1) β not visited, add!
Heap: [(1,0,5), (1,1,6)]
Visited: {(0,0), (0,1), (1,0), (1,1)}
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
ITERATION 3
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
Poll: (1,0) sum=5
Result: [[1,3], [1,4], [2,3]]
Add candidates from (1,0):
Right: (1,1) β ALREADY VISITED! Skip β Duplicate prevented!
Down: (2,0) β out of bounds, skip
Heap: [(1,1,6)]
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
ITERATION 4
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
Poll: (1,1) sum=6
Result: [[1,3], [1,4], [2,3], [2,4]]
k=4 reached! Stop.
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
Note: (1,1) was added only ONCE even though:
- (0,1) β down leads to (1,1)
- (1,0) β right leads to (1,1)
Visited set prevented the duplicate! β
Implementation
import java.util.*;
/**
* Min-Heap with smart generation (Optimal)
* Time: O(k log k)
* Space: O(k)
*/
class Solution {
public List<List<Integer>> kSmallestPairs(int[] nums1, int[] nums2, int k) {
List<List<Integer>> result = new ArrayList<>();
if (nums1.length == 0 || nums2.length == 0 || k == 0) {
return result;
}
// Min-heap: stores (i, j, sum)
PriorityQueue<int[]> minHeap = new PriorityQueue<>(
(a, b) -> a[2] - b[2] // Compare by sum
);
// Visited set to prevent duplicates
Set<String> visited = new HashSet<>();
// Initialize with first row (i=0, j=0 to min(k-1, n-1))
for (int j = 0; j < Math.min(k, nums2.length); j++) {
int sum = nums1[0] + nums2[j];
minHeap.offer(new int[]{0, j, sum});
visited.add("0," + j);
}
// Extract k smallest pairs
while (k > 0 && !minHeap.empty()) {
int[] curr = minHeap.poll();
int i = curr[0];
int j = curr[1];
result.add(Arrays.asList(nums1[i], nums2[j]));
k--;
// Add right neighbor (i, j+1)
if (j + 1 < nums2.length) {
String rightKey = i + "," + (j + 1);
if (!visited.contains(rightKey)) {
int sum = nums1[i] + nums2[j + 1];
minHeap.offer(new int[]{i, j + 1, sum});
visited.add(rightKey);
}
}
// Add down neighbor (i+1, j)
if (i + 1 < nums1.length) {
String downKey = (i + 1) + "," + j;
if (!visited.contains(downKey)) {
int sum = nums1[i + 1] + nums2[j];
minHeap.offer(new int[]{i + 1, j, sum});
visited.add(downKey);
}
}
}
return result;
}
public static void main(String[] args) {
Solution sol = new Solution();
System.out.println(sol.kSmallestPairs(new int[]{1,7,11}, new int[]{2,4,6}, 3));
// [[1,2],[1,4],[1,6]]
System.out.println(sol.kSmallestPairs(new int[]{1,1,2}, new int[]{1,2,3}, 3));
// [[1,1],[1,1],[1,2]]
}
}
β° Time: O(k log k) - k iterations, each O(log k) heap operation
πΎ Space: O(k) - Heap and visited set
Why This is Optimal:
β O(k log k) vs O(mΓn log(mΓn))
β Only generates pairs as needed
β Uses sorted property efficiently
β Stops as soon as k pairs found
Example improvement:
m = 10^5, n = 10^5, k = 100
Brute Force:
Generate: 10^10 pairs
Sort: 10^10 log(10^10)
IMPOSSIBLE! π±
Optimal:
Generate: ~300 pairs (3Γk)
Heap ops: 100 Γ log(100) β 700 ops
FAST! β¨
π Approach Comparison - The Complete Growth Journey
ββββββββββββββββ¬βββββββββββββββββββ¬βββββββββββ¬βββββββββββββββββββ
β Approach β Time β Space β Key Insight β
ββββββββββββββββΌβββββββββββββββββββΌβββββββββββΌβββββββββββββββββββ€
β Brute Force β O(mΓn log(mΓn)) β O(mΓn) β Generate all β
β Smart Heap β O(k log k) β O(k) β Incremental gen β
ββββββββββββββββ΄βββββββββββββββββββ΄βββββββββββ΄βββββββββββββββββββ
THE COMPLETE LEARNING PROGRESSION:
Level 1: Brute Force
Thought: "Generate all mΓn pairs β Sort β Take k"
Works? YES for small inputs β
Optimal? NO β
Problem: When m,n = 10^5, generates 10^10 pairs!
Level 2: Critical Observation
Question: "Do I need all pairs?"
Realization: "I only need k pairs!"
Also: "Arrays are SORTED!"
Level 3: Key Insight
Observation: "Smallest pairs near (0,0)"
Pattern: "Can generate incrementally!"
Strategy: "Start small, expand as needed!"
Level 4: Optimal Solution
Implementation: Min-heap + smart generation
Result: O(k log k) β
Perfect: Only generates ~3k pairs for k result! β¨
Growth: Learned incremental generation pattern!
CONCRETE EXAMPLE (m=10^5, n=10^5, k=100):
Brute Force:
Generate all: 10^10 pairs
Memory: ~120 GB
Time: IMPOSSIBLE! π±
Optimal:
Generate: ~300 pairs (as needed)
Memory: ~4 KB
Time: 100 Γ log(100) β 700 ops
30,000,000x fewer pairs generated! π
Memory reduced by 30,000,000x! β¨
π‘ Key Learnings - Your Complete Growth
WHAT YOU LEARNED:
1. INCREMENTAL GENERATION:
β Don't generate all upfront
β Generate as needed using heap
β Use sorted property to guide generation
2. WHY RIGHT AND DOWN NEIGHBORS:
β Sorted arrays β neighbors have larger sums
β Right: same i, increase j
β Down: increase i, same j
β All reachable pairs eventually added
3. DUPLICATE PREVENTION:
β Same pair reachable from multiple paths
β (1,1) from (0,1)βdown OR (1,0)βright
β Use visited set to track generated pairs
4. INITIALIZATION STRATEGY:
β Start with first row (k positions)
β Spreads initial candidates
β Faster convergence to k pairs
5. PATTERN RECOGNITION:
β Similar to merge k sorted lists
β 2D matrix, pick smallest, expand
β Common in multi-source merge problems
INTERVIEW STRATEGY:
Progressive Presentation:
"I see two approaches:
Approach 1: Generate all mΓn pairs, sort, take k
Time: O(mΓn log(mΓn))
Problem: When m,n=10^5, generates 10^10 pairs!
Approach 2: Incremental generation with min-heap
Key insight: Arrays sorted β smallest pairs near (0,0)
Strategy:
- Start with first row candidates
- Use min-heap to get smallest
- Add right and down neighbors
- Track visited to prevent duplicates
Time: O(k log k)
Much better! Only generates pairs as needed.
Let me implement the optimal solution."
Shows:
β Awareness of brute force impossibility
β Understanding of sorted property
β Incremental generation strategy
β Complexity analysis
β Optimal solution knowledge
This is REAL mastery! π±βπ³βπ
β οΈ Common Mistakes
Mistake 1: Generating all pairs
// β WRONG - Can generate 10^10 pairs!
for (int i = 0; i < nums1.length; i++) {
for (int j = 0; j < nums2.length; j++) {
// Memory overflow!
}
}
// β CORRECT - Generate incrementally
// Use heap to add candidates as needed
Mistake 2: Not preventing duplicates
// β WRONG - Same pair added multiple times
// (1,1) can come from (0,1) or (1,0)
minHeap.offer(new int[]{i+1, j, sum});
minHeap.offer(new int[]{i, j+1, sum});
// β CORRECT - Use visited set
if (!visited.contains(key)) {
minHeap.offer(new int[]{i, j, sum});
visited.add(key);
}
Mistake 3: Wrong initialization
// β WRONG - Only starting with (0,0)
minHeap.offer(new int[]{0, 0, nums1[0] + nums2[0]});
// Slower to explore space
// β CORRECT - Start with first k in first row
for (int j = 0; j < Math.min(k, nums2.length); j++) {
minHeap.offer(new int[]{0, j, nums1[0] + nums2[j]});
}
Mistake 4: Forgetting bounds check
// β WRONG - Array index out of bounds
minHeap.offer(new int[]{i+1, j, nums1[i+1] + nums2[j]});
// β CORRECT - Check bounds first
if (i + 1 < nums1.length) {
minHeap.offer(new int[]{i+1, j, nums1[i+1] + nums2[j]});
}
π― Pattern Recognition
Problem Type: K Smallest from Multiple Sources
Core Pattern: Min-Heap + Incremental Generation
When to Apply:
β Find k smallest/largest from combinations
β Multiple sorted sources to merge
β Can't afford to generate all combinations
β Combinations have ordering property
Recognition Keywords:
- "k smallest/largest"
- "sorted arrays"
- "pairs/combinations"
- "smallest sums"
Similar Problems:
- Merge K Sorted Lists (LC 23)
- Kth Smallest Element in Sorted Matrix (LC 378)
- Ugly Number II (LC 264) - generates incrementally
- Find K-th Smallest Pair Distance (LC 719)
Key Components:
ββββββββββββββββββββββββββββββββββββββββββββββ
β Min-Heap: Track smallest candidates β
β Visited Set: Prevent duplicates β
β Smart Init: Start with good candidates β
β Incremental: Add neighbors as needed β
β Sorted Property: Guide generation β
ββββββββββββββββββββββββββββββββββββββββββββββ
π Quick Revision Notes
π― Core Concept:
Find k pairs with smallest sums from two sorted arrays. Brute force: Generate all mΓn pairs O(mΓn log(mΓn)) - impractical! Optimal: Min-heap with incremental generation O(k log k). Start with first row, use heap to get smallest, add right/down neighbors, track visited to prevent duplicates. Uses sorted property!
β‘ Quick Implementation:
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.PriorityQueue;
public class Solution {
public List<List<Integer>> kSmallestPairs(int[] a, int[] b, int k) {
// return maxHeap(a, b, k); // getting TLE as its O(N2)
return maxHeapOptimal(a, b, k);
}
private List<List<Integer>> maxHeapOptimal(int[] a, int[] b, int k) {
List<List<Integer>> res = new ArrayList<>();
// step 1: max heap that returns elements by sum ascending
PriorityQueue<int[]> pq = new PriorityQueue<>((a1, a2) -> Integer.compare(a1[2], a2[2]));
// step 2: visited set (not to revisit already processed elements
// or existing on heap to get processed)
// why visited? read step 4 and come here
// suppose we process (0,0) and pushed (0,1), (1,0) to PQ
// Now we process (0,1) => which pushes (1,1) and (0,2)
// At some point, if we process (1,0) => again (1,1) comes to picture
HashSet<String> visited = new HashSet<>();
// step 3: lets push (0, 0) elemnt to queue (since it would
// be the smallest of all future generated pairs)
int i = 0;
int j = 0;
pq.offer(new int[] { i, j, a[i] + b[j] });
visited.add(i + "," + j);
// step 4: after (0, 0), next smaller element can be either (0, 1) or (1, 0)
// which is (i, j+1) or (i+1, j). In other words, once you consider an element
// at (i, j) to be present in the result, next element to be pushed to the
// result would be either at (i, j+1) or (i+1, j). Same continues.
// But, you need to push both to PQ. Else (i+1, j) would be missed forever per
// the below example.
// If we push (i, j+1), next would be either (i, j+2) or (i+1, j+1). So, add
// other one also as heap anyways give min element
while (k > 0 && !pq.isEmpty()) {
// step 5: poll the current min element and add to result
int[] polled = pq.poll();
i = polled[0];
j = polled[1];
res.add(List.of(a[i], b[j]));
k--;
// step 6: do for the next element
if (j + 1 < b.length && !visited.contains(i + "," + (j + 1))) {
pq.offer(new int[] { i, j + 1, a[i] + b[j + 1] });
visited.add(i + "," + (j + 1));
}
if (i + 1 < a.length && !visited.contains((i + 1) + "," + j)) {
pq.offer(new int[] { i + 1, j, a[i + 1] + b[j] });
visited.add((i + 1) + "," + j);
}
}
return res;
}
private List<List<Integer>> maxHeap(int[] a, int[] b, int k) {
List<List<Integer>> res = new ArrayList<>();
PriorityQueue<int[]> pq = new PriorityQueue<>((a1, a2) -> Integer.compare(a2[2], a1[2]));
for (int i = 0; i < a.length; i++) {
for (int j = 0; j < b.length; j++) {
pq.offer(new int[] { a[i], b[j], a[i] + b[j] });
if (pq.size() > k) {
pq.poll();
}
}
}
for (int i = 0; i < k; i++) {
int[] polled = pq.poll();
res.add(List.of(polled[0], polled[1]));
}
return res;
}
public static void main(String[] args) {
Solution s = new Solution();
// [[1,2],[1,4],[1,6]]
System.out.println(s.kSmallestPairs(new int[] { 1, 7, 11 }, new int[] { 2, 4, 6 }, 3));
// [[1,1],[1,1]]
System.out.println(s.kSmallestPairs(new int[] { 1, 1, 2 }, new int[] { 1, 2, 3 }, 2));
}
}
π Key Insights:
- Why Incremental: Can't generate 10^10 pairs!
- Why Right/Down: Sorted arrays β neighbors larger
- Why Visited: Same pair from multiple paths
- Start First Row: Spreads initial candidates efficiently
- 2D Matrix View: Helps visualize generation
- Complexity: O(k log k) vs O(mΓn log(mΓn))
- Space: O(k) vs O(mΓn) - HUGE difference! β
πͺ Memory Aid:
"Start first row, heap tracks smallest!"
"Add right and down, track visited!"
"Generate as needed - don't build all!" β¨
π§ͺ Edge Cases
Case 1: k larger than total pairs
Input: nums1=[1,2], nums2=[3], k=3
Total pairs: 2
Output: [[1,3],[2,3]]
Return all available pairs
Case 2: Empty array
Input: nums1=[], nums2=[1,2], k=2
Output: []
Case 3: k = 1
Input: nums1=[1,7], nums2=[2,4], k=1
Output: [[1,2]]
Smallest pair only
All handled correctly! β
π Complexity Analysis
Approach 1: Brute Force
Time: O(mΓn log(mΓn))
Generate all: O(mΓn)
Sort: O(mΓn log(mΓn))
Space: O(mΓn)
Store all pairs
Impractical for large m,n!
Approach 2: Optimal
Time: O(k log k)
k iterations
Each: poll O(log k) + offer O(log k)
Space: O(k)
Heap: at most k elements
Visited: at most 3k entries
Much better! Only generates needed pairs!
Happy practicing! π―
Note: This problem is BRILLIANT for learning incremental generation! You can't generate all 10^10 pairs, so you must be smart. The key insight: use sorted property to generate pairs incrementally. Start with smallest candidates, use min-heap to track, add neighbors as you go. The visualization as a 2D matrix where you explore right and down is BEAUTIFUL! Understanding WHY we add only right/down neighbors (sorted property), WHY we need visited set (multiple paths), and WHY we start with first row (efficiency) builds deep algorithmic thinking. This pattern appears in many "k smallest from multiple sources" problems! True mastery! πͺβ¨π