241. Longest Common Subsequence
🔗 LeetCode Problem: 1143. Longest Common Subsequence
📊 Difficulty: Medium
🏷️ Topics: Dynamic Programming, String, DP - Strings
Problem Statement
Given two strings text1 and text2, return the length of their longest common subsequence. If there is no common subsequence, return 0.
A subsequence of a string is a new string generated from the original string with some characters (can be none) deleted without changing the relative order of the remaining characters.
For example, "ace" is a subsequence of "abcde".
A common subsequence of two strings is a subsequence that is common to both strings.
Example 1:
Input: text1 = "abcde", text2 = "ace"
Output: 3
Explanation: The longest common subsequence is "ace" and its length is 3.
Example 2:
Input: text1 = "abc", text2 = "abc"
Output: 3
Explanation: The longest common subsequence is "abc" and its length is 3.
Example 3:
Input: text1 = "abc", text2 = "def"
Output: 0
Explanation: There is no such common subsequence, so the result is 0.
Constraints:
- 1 <= text1.length, text2.length <= 1000
- text1 and text2 consist of only lowercase English characters.
🧒 Understanding from Absolute First Principles
What IS a Subsequence? (Really Understanding It)
Let's build this concept from scratch:
Given string: "abcde"
Question: What strings can I make by ONLY deleting characters?
Delete nothing: "abcde" ✓
Delete 'b': "acde" ✓
Delete 'b' and 'd': "ace" ✓
Delete 'a', 'c', 'e': "bd" ✓
Delete everything: "" ✓
Key rules I notice:
1. I can SKIP characters
2. I CANNOT rearrange
3. I CANNOT add new characters
4. Order is PRESERVED
Example: Can I get "aec" from "abcde"?
"abcde" → a_ec__ ?
'e' comes before 'c' in original? Yes!
So "aec" is possible ✓
Example: Can I get "ade" from "abcde"?
"abcde" → a_d_e_?
Yes! Skip b and c ✓
Example: Can I get "dba" from "abcde"?
"abcde" → d_b_a_?
Wait... 'd' comes AFTER 'b' and 'a' in original
I can't rearrange! ✗
CORE INSIGHT: Subsequence = Highlighting with no reordering! 🔑
What IS a Common Subsequence?
Two strings: text1 = "abcde", text2 = "ace"
Question: What subsequences appear in BOTH?
From text1 = "abcde":
Subsequences: "a", "b", "c", "d", "e", "ab", "ac", "ad", "ae",
"bc", "bd", "be", "cd", "ce", "de", "abc", "abd",
"abe", "acd", "ace", "ade", "bcd", "bce", "bde",
"cde", "abcd", "abce", "abde", "acde", "bcde", "abcde"
From text2 = "ace":
Subsequences: "a", "c", "e", "ac", "ae", "ce", "ace"
Common to BOTH:
"a" ✓ (in both)
"c" ✓ (in both)
"e" ✓ (in both)
"ac" ✓ (in both)
"ae" ✓ (in both)
"ce" ✓ (in both)
"ace" ✓ (in both)
Longest common subsequence: "ace" (length 3) ✓
INSIGHT: We're finding what's SHARED between strings! 🎯
🤔 Why Is This Problem Not Trivial?
The Naive Approach - Why It Fails
Attempt 1: Just find common characters?
text1 = "abcde"
text2 = "ace"
Common characters: 'a', 'c', 'e'
Answer: "ace" ✓
Wait, that worked! Is it always this simple?
Let me try another:
text1 = "abc"
text2 = "cba"
Common characters: 'a', 'b', 'c'
Can I form "abc" as subsequence of "cba"?
"cba" → a_b_c_?
'a' comes after 'b' and 'c' in "cba"! ✗
Can I form "cba" as subsequence of "abc"?
"abc" → c_b_a_?
'c' comes after 'a' and 'b' in "abc"! ✗
What about just "a"? Yes! ✓
What about just "b"? Yes! ✓
What about just "c"? Yes! ✓
What about "ab"?
In text1: yes
In text2 "cba": c[ba] yes! ✓
What about "bc"?
In text1: yes
In text2 "cba": [cb]a yes! ✓
Longest: "ab" or "bc" (both length 2)
INSIGHT: ORDER MATTERS! Can't just look at common characters! 🔑
Why This Problem Is Hard
The challenge: EXPONENTIAL possibilities!
For text1 of length m:
Number of subsequences = 2^m
For text2 of length n:
Number of subsequences = 2^n
Brute force:
Generate all subsequences of text1: O(2^m)
For each, check if it's a subsequence of text2: O(n)
Total: O(2^m × n) ❌
For m=n=1000:
2^1000 ≈ 10^301 operations
More than atoms in the universe! ❌
We need a SMARTER approach! 🎯
💡 Building Intuition - The Recursive Insight
Thinking Recursively
Let me think about comparing two strings CHARACTER BY CHARACTER:
text1 = "abcde"
text2 = "ace"
Start comparing from the beginning:
Position 0:
text1[0] = 'a'
text2[0] = 'a'
They MATCH! ✓
When characters match, what does this mean?
INSIGHT: If first characters match, they can BOTH be in LCS!
LCS("abcde", "ace") = 'a' + LCS("bcde", "ce")
↑ matched! ↑ remaining strings
This is RECURSIVE! 🎯
Let me continue:
LCS("bcde", "ce"):
text1[0] = 'b'
text2[0] = 'c'
They DON'T match! ✗
When characters don't match, what do we do?
INSIGHT: At least ONE of them won't be in the LCS!
Two possibilities:
Option 1: Skip text1[0], keep text2[0]
LCS("cde", "ce")
Option 2: Keep text1[0], skip text2[0]
LCS("bcde", "e")
We don't know which is better, so try BOTH!
Take the LONGER result!
This is the RECURSIVE STRUCTURE! ✓
The Recursive Definition
Let's formalize:
LCS(i, j) = LCS of text1[i...] and text2[j...]
Base Cases:
If i == len(text1): no characters left in text1 → LCS = 0
If j == len(text2): no characters left in text2 → LCS = 0
Recursive Cases:
Case 1: text1[i] == text2[j] (characters MATCH)
Both can be in LCS!
LCS(i, j) = 1 + LCS(i+1, j+1)
Why +1? We include this matching character!
Why i+1, j+1? Move to next in BOTH strings!
Case 2: text1[i] != text2[j] (characters DON'T match)
At least one must be skipped!
Option A: Skip text1[i]
LCS(i+1, j)
Option B: Skip text2[j]
LCS(i, j+1)
LCS(i, j) = max(Option A, Option B)
Why max? We want the LONGEST!
This is BEAUTIFUL! 🌟
🎯 Why Does the Recursive Logic Work?
Mathematical Proof (By Induction)
THEOREM: The recursive definition correctly computes LCS.
PROOF (by strong induction on remaining string lengths):
Base Case: When i == m or j == n
One string is empty
No common subsequence possible
LCS = 0 ✓
Inductive Hypothesis:
Assume the formula works for all strings shorter than text1[i...]
and text2[j...]
Inductive Step:
Case 1: text1[i] == text2[j]
Claim: LCS(i, j) = 1 + LCS(i+1, j+1)
Proof:
Let L be the optimal LCS of text1[i...] and text2[j...]
Sub-case A: text1[i] is IN L and text2[j] is IN L
Since text1[i] == text2[j], they're the same character!
L = [this char] + [LCS of remaining]
|L| = 1 + |LCS(i+1, j+1)|
By inductive hypothesis, LCS(i+1, j+1) is correct
So LCS(i, j) = 1 + LCS(i+1, j+1) ✓
Sub-case B: text1[i] is NOT in L
Then L is a subsequence of text1[i+1...]
|L| ≤ |LCS(i+1, j)|
But wait! We can ADD text1[i] = text2[j] to L!
New length = 1 + |L| > |L|
Contradiction! (L was supposed to be optimal)
So this case is impossible! ✗
Sub-case C: text2[j] is NOT in L
Similar contradiction as Sub-case B
Impossible! ✗
Therefore: When chars match, formula is correct! ✓
Case 2: text1[i] != text2[j]
Claim: LCS(i, j) = max(LCS(i+1, j), LCS(i, j+1))
Proof:
Let L be the optimal LCS
Since text1[i] != text2[j], they can't BOTH be in L!
Sub-case A: text1[i] is NOT in L
L is subsequence of text1[i+1...] and text2[j...]
|L| ≤ |LCS(i+1, j)|
By inductive hypothesis, LCS(i+1, j) is correct
Sub-case B: text2[j] is NOT in L
L is subsequence of text1[i...] and text2[j+1...]
|L| ≤ |LCS(i, j+1)|
By inductive hypothesis, LCS(i, j+1) is correct
At least ONE sub-case must be true!
So |L| ≤ max(LCS(i+1, j), LCS(i, j+1))
Conversely, max(LCS(i+1, j), LCS(i, j+1)) is a valid LCS
So |L| ≥ max(LCS(i+1, j), LCS(i, j+1))
Therefore: |L| = max(LCS(i+1, j), LCS(i, j+1)) ✓
By induction, the formula is correct for all cases! QED ✓
🔴 Approach 1: Pure Recursion
The Implementation
class Solution {
public int longestCommonSubsequence(String text1, String text2) {
return lcs(text1, text2, 0, 0);
}
private int lcs(String text1, String text2, int i, int j) {
// Base cases
if (i == text1.length() || j == text2.length()) {
return 0;
}
// Case 1: Characters match
if (text1.charAt(i) == text2.charAt(j)) {
return 1 + lcs(text1, text2, i+1, j+1);
}
// Case 2: Characters don't match
int skipText1 = lcs(text1, text2, i+1, j);
int skipText2 = lcs(text1, text2, i, j+1);
return Math.max(skipText1, skipText2);
}
}
Why This Is Exponential
TIME COMPLEXITY ANALYSIS:
Let T(m, n) = time to compute LCS of strings of length m and n
Recurrence:
T(m, n) = T(m-1, n) + T(m, n-1) + O(1)
This is similar to Fibonacci!
But in 2D...
Recursion tree:
(0,0)
/ \
(1,0) (0,1)
/ \ / \
(2,0)(1,1)(1,1)(0,2)
↑ ↑
Same state called TWICE!
Number of unique states: m × n
Each state can be reached MANY times!
Worst case: O(2^(m+n))
For m=n=20:
2^40 ≈ 1 trillion! ❌
OBSERVATION: Overlapping subproblems!
LCS(1, 1) computed multiple times
LCS(2, 3) computed multiple times
This SCREAMS dynamic programming! 🔑
🟡 Approach 2: Memoization (Top-Down DP)
Why Memoization Works
INSIGHT: Same (i, j) computed many times!
Solution: Remember the result!
memo[i][j] = LCS length for text1[i...] and text2[j...]
First time computing LCS(i, j):
1. Check memo[i][j]
2. If not computed (= -1), compute it
3. Store in memo[i][j]
4. Return result
Next time computing LCS(i, j):
1. Check memo[i][j]
2. Already computed! Return immediately!
NUMBER OF UNIQUE STATES:
i: 0 to m
j: 0 to n
Total: m × n states
Each state computed ONCE!
Time per state: O(1)
TOTAL TIME: O(m × n) ✓
HUGE IMPROVEMENT: 2^40 → 400 for m=n=20! 🚀
Implementation
class Solution {
private int[][] memo;
public int longestCommonSubsequence(String text1, String text2) {
int m = text1.length();
int n = text2.length();
// Initialize memoization table
memo = new int[m][n];
for (int i = 0; i < m; i++) {
for (int j = 0; j < n; j++) {
memo[i][j] = -1; // -1 means not computed
}
}
return lcs(text1, text2, 0, 0);
}
private int lcs(String text1, String text2, int i, int j) {
// Base cases
if (i == text1.length() || j == text2.length()) {
return 0;
}
// Check if already computed
if (memo[i][j] != -1) {
return memo[i][j];
}
// Compute result
int result;
if (text1.charAt(i) == text2.charAt(j)) {
result = 1 + lcs(text1, text2, i+1, j+1);
} else {
int skip1 = lcs(text1, text2, i+1, j);
int skip2 = lcs(text1, text2, i, j+1);
result = Math.max(skip1, skip2);
}
// Store and return
memo[i][j] = result;
return result;
}
}
🟢 Approach 3: Bottom-Up DP
Transitioning from Recursion to Iteration
OBSERVATION:
Recursion: Start from (0, 0), recurse to larger i, j
Top-down approach
Bottom-up: Compute larger i, j first, build backwards
Start from base cases, build up to answer
Wait... that doesn't sound right for this problem!
Let me reconsider the state definition:
OPTION A (matching recursion):
dp[i][j] = LCS of text1[i...end] and text2[j...end]
Dependencies: dp[i][j] needs dp[i+1][...] and dp[...][j+1]
Larger indices needed first!
Must iterate BACKWARDS!
OPTION B (more natural for bottom-up):
dp[i][j] = LCS of text1[0...i-1] and text2[0...j-1]
Dependencies: dp[i][j] needs dp[i-1][...] and dp[...][j-1]
Smaller indices needed first!
Can iterate FORWARDS!
I'll use OPTION B for cleaner iteration! ✓
Understanding the New State Definition
dp[i][j] = LCS length of first i chars of text1 and first j chars of text2
Example: text1 = "abcde", text2 = "ace"
dp[0][0] = LCS of "" and "" = 0
dp[1][1] = LCS of "a" and "a" = 1
dp[3][2] = LCS of "abc" and "ac" = 2
dp[5][3] = LCS of "abcde" and "ace" = 3 (our answer!)
This is more NATURAL for bottom-up! ✓
Base cases:
dp[0][j] = 0 (empty text1, no LCS)
dp[i][0] = 0 (empty text2, no LCS)
Recurrence (adjusted for 0-indexing):
If text1[i-1] == text2[j-1]:
dp[i][j] = 1 + dp[i-1][j-1]
Why i-1? Because dp[i] represents first i characters,
so last character is at index i-1!
Else:
dp[i][j] = max(dp[i-1][j], dp[i][j-1])
Same logic, just different indexing! ✓
Why Forward Loops Work
DEPENDENCY ANALYSIS:
dp[i][j] depends on:
- dp[i-1][j-1] (diagonal up-left)
- dp[i-1][j] (directly above)
- dp[i][j-1] (directly left)
Visual:
j-1 j
i-1 ↓ ↓
↘
i → [i,j]
All arrows point UP and LEFT!
For dp[i][j] to be computed:
- Need i-1 computed first (smaller i)
- Need j-1 computed first (smaller j)
Therefore:
- i must go FORWARD: 0 → m
- j must go FORWARD: 0 → n
LOOP PATTERN:
for (int i = 1; i <= m; i++) { // Forward ✓
for (int j = 1; j <= n; j++) { // Forward ✓
// All dependencies already computed!
}
}
This is the STANDARD 2D DP pattern! ✓
Complete Implementation
class Solution {
public int longestCommonSubsequence(String text1, String text2) {
int m = text1.length();
int n = text2.length();
// dp[i][j] = LCS of first i chars of text1, first j chars of text2
int[][] dp = new int[m + 1][n + 1];
// Base cases: dp[0][j] = 0, dp[i][0] = 0 (already 0 by default)
// Fill the table
for (int i = 1; i <= m; i++) {
for (int j = 1; j <= n; j++) {
if (text1.charAt(i-1) == text2.charAt(j-1)) {
// Characters match
dp[i][j] = 1 + dp[i-1][j-1];
} else {
// Characters don't match
dp[i][j] = Math.max(dp[i-1][j], dp[i][j-1]);
}
}
}
return dp[m][n];
}
}
🔍 Deep Dive: Why the DP Recurrence Is Correct
Proof of Correctness for Bottom-Up
CLAIM: dp[i][j] correctly computes LCS of first i chars and first j chars
PROOF (by induction on i + j):
Base Case: i = 0 or j = 0
One string is empty
LCS = 0
dp[0][j] = 0, dp[i][0] = 0 ✓
Inductive Hypothesis:
Assume dp[i'][j'] is correct for all i' + j' < i + j
Inductive Step:
Case 1: text1[i-1] == text2[j-1]
Last characters of both substrings match!
Claim: dp[i][j] = 1 + dp[i-1][j-1]
Proof:
Let L* be the optimal LCS of text1[0...i-1] and text2[0...j-1]
Two scenarios:
Scenario A: text1[i-1] is in L* and text2[j-1] is in L*
Since text1[i-1] == text2[j-1], this is the SAME character
It must be at the END of L* (to maintain order)
L* = [LCS of text1[0...i-2] and text2[0...j-2]] + [text1[i-1]]
|L*| = dp[i-1][j-1] + 1
By inductive hypothesis, dp[i-1][j-1] is correct
So dp[i][j] = 1 + dp[i-1][j-1] ✓
Scenario B: text1[i-1] is NOT in L*
L* is subsequence of text1[0...i-2] and text2[0...j-1]
|L*| ≤ dp[i-1][j]
But wait! We can ADD text1[i-1] = text2[j-1] to L*!
Since text2[j-1] exists, we can match it with text1[i-1]
New LCS length = |L*| + 1 > |L*|
Contradiction! (L* was supposed to be optimal)
This scenario is impossible! ✗
Scenario C: text2[j-1] is NOT in L*
Similar contradiction
Impossible! ✗
Therefore: dp[i][j] = 1 + dp[i-1][j-1] is correct! ✓
Case 2: text1[i-1] != text2[j-1]
Last characters are different!
Claim: dp[i][j] = max(dp[i-1][j], dp[i][j-1])
Proof:
Let L* be the optimal LCS
Since text1[i-1] != text2[j-1], they can't BOTH be in L*
Scenario A: text1[i-1] is NOT in L*
L* is subsequence of text1[0...i-2] and text2[0...j-1]
|L*| ≤ dp[i-1][j]
By inductive hypothesis, dp[i-1][j] is correct
Scenario B: text2[j-1] is NOT in L*
L* is subsequence of text1[0...i-1] and text2[0...j-2]
|L*| ≤ dp[i][j-1]
By inductive hypothesis, dp[i][j-1] is correct
At least ONE scenario must be true!
So |L*| ≤ max(dp[i-1][j], dp[i][j-1])
Also, max(dp[i-1][j], dp[i][j-1]) is a valid LCS length
So |L*| ≥ max(dp[i-1][j], dp[i][j-1])
Therefore: |L*| = max(dp[i-1][j], dp[i][j-1]) = dp[i][j] ✓
By induction, dp[i][j] is correct for all i, j! QED ✓
🎨 Visual Example: Building the DP Table
Step-by-Step Construction
text1 = "abcde", text2 = "ace"
m = 5, n = 3
Initialize:
"" a c e
"" 0 0 0 0
a 0 ? ? ?
b 0 ? ? ?
c 0 ? ? ?
d 0 ? ? ?
e 0 ? ? ?
Fill row by row, left to right:
i=1 (text1[0]='a'):
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
j=1: text1[0]='a', text2[0]='a' → match! ✓
dp[1][1] = 1 + dp[0][0] = 1 + 0 = 1
j=2: text1[0]='a', text2[1]='c' → different
dp[1][2] = max(dp[0][2]=0, dp[1][1]=1) = 1
j=3: text1[0]='a', text2[2]='e' → different
dp[1][3] = max(dp[0][3]=0, dp[1][2]=1) = 1
"" a c e
"" 0 0 0 0
a 0 1 1 1
b 0 ? ? ?
c 0 ? ? ?
d 0 ? ? ?
e 0 ? ? ?
i=2 (text1[1]='b'):
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
j=1: text1[1]='b', text2[0]='a' → different
dp[2][1] = max(dp[1][1]=1, dp[2][0]=0) = 1
j=2: text1[1]='b', text2[1]='c' → different
dp[2][2] = max(dp[1][2]=1, dp[2][1]=1) = 1
j=3: text1[1]='b', text2[2]='e' → different
dp[2][3] = max(dp[1][3]=1, dp[2][2]=1) = 1
"" a c e
"" 0 0 0 0
a 0 1 1 1
b 0 1 1 1
c 0 ? ? ?
d 0 ? ? ?
e 0 ? ? ?
i=3 (text1[2]='c'):
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
j=1: text1[2]='c', text2[0]='a' → different
dp[3][1] = max(dp[2][1]=1, dp[3][0]=0) = 1
j=2: text1[2]='c', text2[1]='c' → match! ✓
dp[3][2] = 1 + dp[2][1] = 1 + 1 = 2
j=3: text1[2]='c', text2[2]='e' → different
dp[3][3] = max(dp[2][3]=1, dp[3][2]=2) = 2
"" a c e
"" 0 0 0 0
a 0 1 1 1
b 0 1 1 1
c 0 1 2 2
d 0 ? ? ?
e 0 ? ? ?
i=4 (text1[3]='d'):
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
j=1: text1[3]='d', text2[0]='a' → different
dp[4][1] = max(dp[3][1]=1, dp[4][0]=0) = 1
j=2: text1[3]='d', text2[1]='c' → different
dp[4][2] = max(dp[3][2]=2, dp[4][1]=1) = 2
j=3: text1[3]='d', text2[2]='e' → different
dp[4][3] = max(dp[3][3]=2, dp[4][2]=2) = 2
"" a c e
"" 0 0 0 0
a 0 1 1 1
b 0 1 1 1
c 0 1 2 2
d 0 1 2 2
e 0 ? ? ?
i=5 (text1[4]='e'):
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
j=1: text1[4]='e', text2[0]='a' → different
dp[5][1] = max(dp[4][1]=1, dp[5][0]=0) = 1
j=2: text1[4]='e', text2[1]='c' → different
dp[5][2] = max(dp[4][2]=2, dp[5][1]=1) = 2
j=3: text1[4]='e', text2[2]='e' → match! ✓
dp[5][3] = 1 + dp[4][2] = 1 + 2 = 3
"" a c e
"" 0 0 0 0
a 0 1 1 1
b 0 1 1 1
c 0 1 2 2
d 0 1 2 2
e 0 1 2 3
ANSWER: dp[5][3] = 3 ✓
The LCS is "ace" with length 3!
Understanding the Pattern
OBSERVATIONS from the table:
1. Each row represents adding one character from text1
2. Each column represents adding one character from text2
3. Values never DECREASE going right or down
(LCS length can only increase or stay same)
4. When characters match: diagonal + 1
5. When they don't match: max(top, left)
This pattern appears in ALL LCS problems! 🔑
📊 Complexity Analysis
Time Complexity
THREE APPROACHES:
1. Pure Recursion:
Worst case: O(2^(m+n))
Why? Each call branches into 2, exponential tree
2. Memoization:
States: m × n
Time per state: O(1)
Total: O(m × n) ✓
3. Bottom-Up DP:
Two nested loops: i from 1 to m, j from 1 to n
Each iteration: O(1) work
Total: O(m × n) ✓
Both DP approaches: O(m × n) ✓
Space Complexity
1. Pure Recursion:
Call stack depth: O(m + n)
Why? Worst case, we recurse until both strings empty
2. Memoization:
Memo table: O(m × n)
Call stack: O(m + n)
Total: O(m × n) ✓
3. Bottom-Up DP:
DP table: O(m × n)
No recursion: O(1) extra
Total: O(m × n) ✓
Can we optimize space?
OPTIMIZATION: Notice dp[i][j] only needs:
- Current row
- Previous row
Space-optimized: O(min(m, n))
Keep only 2 rows instead of entire table!
But this loses ability to reconstruct actual LCS string!
Trade-off between space and information! ✓
🎯 The Deeper Pattern - Why LCS Is Fundamental
LCS as a Building Block
LCS is used as a SUBROUTINE in many problems:
Problem 239 (Delete Operations):
deletions = m + n - 2×LCS
Problem 240 (Shortest Supersequence):
length = m + n - LCS
Uses LCS table to build result
Edit Distance:
Measures transformations needed
Related to LCS (different operations)
Diff Tools:
Git diff, file comparison
Based on LCS algorithm!
DNA Sequence Alignment:
Find similar regions
Core: LCS with scoring
LCS is a FUNDAMENTAL algorithm! 🔑
Why Is This Pattern So Common?
THEORETICAL FOUNDATION:
LCS represents the MAXIMUM OVERLAP between sequences
In mathematics:
Overlap = Intersection
LCS = Sequence Intersection
Applications arise whenever we ask:
"What do these sequences have in COMMON?"
Examples:
- Version control (code lines)
- Bioinformatics (DNA bases)
- Data compression (repeated patterns)
- Plagiarism detection (common text)
Understanding LCS deeply = Understanding sequence comparison! ✓
💡 Key Insights Summary
The Mathematical Beauty
1. WHY recursive structure works:
Optimal substructure property
If characters match → both in optimal solution
If they don't → at least one must be excluded
2. WHY overlapping subproblems exist:
Same (i, j) states reached via different paths
Exponential tree collapses to polynomial table
3. WHY forward loops work:
Dependencies point to smaller indices
Compute in increasing order
4. WHY this is a pattern:
Many problems reduce to sequence comparison
LCS captures the essence of overlap
📝 Quick Revision
🎯 Core Definition
LCS: Longest sequence appearing in both strings (in order)
Not: Common characters (order matters!)
⚡ The Recurrence
public class Solution {
public int longestCommonSubsequence(String s1, String s2) {
// int lcs = recursive(s1, s2, 0, 0);
// Integer[][] memo = new Integer[s1.length() + 1][s2.length() + 1];
// int lcs = topDown(s1, s2, 0, 0, memo);
int lcs = bottomUp(s1, s2);
return lcs == Integer.MIN_VALUE ? 0 : lcs;
}
private int bottomUp(String s1, String s2) {
int len1 = s1.length();
int len2 = s2.length();
int[][] dp = new int[len1 + 1][len2 + 1];
for (int i = 1; i <= len1; i++) {
for (int j = 1; j <= len2; j++) {
if (s1.charAt(i - 1) == s2.charAt(j - 1)) {
dp[i][j] = 1 + dp[i - 1][j - 1];
} else {
// step 2: skip char at i
int option1 = dp[i - 1][j];
// step 3: skip char at j
int option2 = dp[i][j - 1];
dp[i][j] = Math.max(option1, option2);
}
}
}
return dp[len1][len2];
}
private int topDown(String s1, String s2, int i, int j, Integer[][] memo) {
// step 4a: base case 1
if (i == s1.length() && j == s2.length()) {
return 0;
}
// step 4b: base case 2
if (i >= s1.length() || j >= s2.length()) {
return Integer.MIN_VALUE;
}
if (memo[i][j] != null) {
return memo[i][j];
}
// step 1: current chars matched. Proceed with next chars
if (s1.charAt(i) == s2.charAt(j)) {
int temp = topDown(s1, s2, i + 1, j + 1, memo);
if (temp == Integer.MIN_VALUE) {
temp = 0;
}
return 1 + temp;
}
// step 2: skip char at i
int option1 = topDown(s1, s2, i + 1, j, memo);
// step 3: skip char at j
int option2 = topDown(s1, s2, i, j + 1, memo);
return memo[i][j] = Math.max(option1, option2);
}
private int recursive(String s1, String s2, int i, int j) {
// step 4a: base case 1
if (i == s1.length() && j == s2.length()) {
return 0;
}
// step 4b: base case 2
if (i >= s1.length() || j >= s2.length()) {
return Integer.MIN_VALUE;
}
// step 1: current chars matched. Proceed with next chars
if (s1.charAt(i) == s2.charAt(j)) {
int temp = recursive(s1, s2, i + 1, j + 1);
if (temp == Integer.MIN_VALUE) {
temp = 0;
}
return 1 + temp;
}
// step 2: skip char at i
int option1 = recursive(s1, s2, i + 1, j);
// step 3: skip char at j
int option2 = recursive(s1, s2, i, j + 1);
return Math.max(option1, option2);
}
public static void main(String[] args) {
Solution s = new Solution();
System.out.println(s.longestCommonSubsequence("abcde", "ace") == 3);
System.out.println(s.longestCommonSubsequence("abc", "abc") == 3);
System.out.println(s.longestCommonSubsequence("abc", "def") == 0);
}
}
🔑 Why It Works
- Match: Must include both (proof by contradiction)
- No match: Skip one, try both options
- Forward loops: Dependencies on i-1, j-1
Complexity: O(m×n) time and space
🎉 Congratulations!
You've mastered LCS through: - ✅ First principles understanding - ✅ Mathematical proofs (induction) - ✅ Recursive insight discovery - ✅ Complete correctness proof - ✅ Pattern recognition across problems
This is the FOUNDATION for string DP! 🚀✨