You are given two strings word1
and word2
. In one operation, you can delete exactly one character from either string. Return the minimum number of deletions required so that the two strings become the same.
Key idea: if LCS(word1, word2)
is the length of their Longest Common Subsequence, then
minimum deletions = len(word1) + len(word2) − 2 × LCS(word1, word2).
Everything outside the LCS must be deleted from one side or the other.
Here’s the [Problem Link] to begin with.
Example 1
word1 = "sea", word2 = "eat"
LCS = "ea" with length 2
Answer = 3 + 3 − 2×2 = 2
Example 2
word1 = "leetcode", word2 = "etco"
LCS length = 4
Answer = 8 + 4 − 2×4 = 4
Below are three solutions. We explain the problem only once at the top, then for each solution provide Intuition and Approach, Code Implementation, Code explanation, Time and Space Complexity, and a short Conclusion.
Memoization
Intuition and Approach
Compute LCS using top down recursion with caching.
Let solve(i, j)
be the LCS length of prefixes word1[:i]
and word2[:j]
using 1-based lengths:
- If
i == 0
orj == 0
, result is 0. - If
word1[i-1] == word2[j-1]
, take1 + solve(i-1, j-1)
. - Else take
max(solve(i-1, j), solve(i, j-1))
.
Finally compute deletions as n + m − 2 × lcs
.
Code Implementation
class Solution:
def solve(self, text1: str, text2: str, ind1: int, ind2: int, dp) -> int:
if ind1 == 0 or ind2 == 0:
return 0
if dp[ind1][ind2] != -1:
return dp[ind1][ind2]
if text1[ind1 - 1] == text2[ind2 - 1]:
dp[ind1][ind2] = 1 + self.solve(text1, text2, ind1 - 1, ind2 - 1, dp)
else:
dp[ind1][ind2] = 0 + max(
self.solve(text1, text2, ind1 - 1, ind2, dp),
self.solve(text1, text2, ind1, ind2 - 1, dp)
)
return dp[ind1][ind2]
def minDistance(self, word1: str, word2: str) -> int:
n = len(word1)
m = len(word2)
dp = [[-1 for _ in range(m + 1)] for _ in range(n + 1)]
lcs = self.solve(word1, word2, n, m, dp)
return n + m - (2 * lcs)
Code explanation
dp[i][j]
caches LCS length for prefixes of lengthsi
andj
.- Matching characters extend the subsequence from the diagonal state.
- Otherwise, we try skipping one character from either string and take the maximum.
- After computing LCS, use the formula to get the minimum deletions.
Time and Space Complexity
- Precise: states
n*m
, each computed once, time O(n*m); DP table O(n*m) and recursion stack O(n + m). - Simplified: O(n*m) time, O(n*m) space.
Conclusion
Memoization converts exponential recursion into polynomial time and is simple to implement.
Tabulation
Intuition and Approach
Bottom up dynamic programming for LCS. Build a (n+1) × (m+1)
table where row 0 and column 0 are zeros, then apply the standard match or max transition. Answer is n + m − 2 × dp[n][m]
.
Code Implementation
class Solution:
def longestCommonSubsequence(self, text1: str, text2: str) -> int:
n = len(text1)
m = len(text2)
dp = [[-1 for _ in range(m + 1)] for _ in range(n + 1)]
for ind2 in range(0, m + 1):
dp[0][ind2] = 0
for ind1 in range(0, n + 1):
dp[ind1][0] = 0
for ind1 in range(1, n + 1):
for ind2 in range(1, m + 1):
if text1[ind1 - 1] == text2[ind2 - 1]:
dp[ind1][ind2] = 1 + dp[ind1 - 1][ind2 - 1]
else:
dp[ind1][ind2] = 0 + max(dp[ind1 - 1][ind2], dp[ind1][ind2 - 1])
return dp[n][m]
def minDistance(self, word1: str, word2: str) -> int:
n = len(word1)
m = len(word2)
lcs = self.longestCommonSubsequence(word1, word2)
return n + m - (2 * lcs)
Code explanation
- Initialize base cases for empty prefixes.
- For each cell, extend on match or carry forward the best from top or left on mismatch.
- Use the LCS length in the deletion formula.
Time and Space Complexity
- Precise: fill
n*m
cells in O(1) each, time O(n*m); table O(n*m) space. - Simplified: O(n*m) time, O(n*m) space.
Conclusion
Deterministic and stack safe. Also convenient if you later want to reconstruct the LCS.
Tabulation (Space Optimize)
Intuition and Approach
dp[i][j]
depends only on the previous row and the current row’s previous cell. Keep two 1D arrays to reduce space to linear.
Code Implementation
class Solution:
def longestCommonSubsequence(self, text1: str, text2: str) -> int:
n = len(text1)
m = len(text2)
prev = [0] * (m + 1)
for ind1 in range(1, n + 1):
cur = [0] * (m + 1)
for ind2 in range(1, m + 1):
if text1[ind1 - 1] == text2[ind2 - 1]:
cur[ind2] = 1 + prev[ind2 - 1]
else:
cur[ind2] = 0 + max(prev[ind2], cur[ind2 - 1])
prev = cur
return prev[m]
def minDistance(self, word1: str, word2: str) -> int:
n = len(word1)
m = len(word2)
lcs = self.longestCommonSubsequence(word1, word2)
return n + m - (2 * lcs)
Code explanation
prev[j]
holds the previous row’s LCS values andcur[j]
the current row’s values.- Update from
prev[j-1]
,prev[j]
, andcur[j-1]
. - After each row, roll
cur
intoprev
. - Compute deletions using the final LCS value.
Time and Space Complexity
- Precise: still O(n*m) time; arrays of size
m+1
give O(m) extra space. - Simplified: O(n*m) time, O(min(n, m)) space if you iterate columns along the shorter string.
Conclusion
Same optimal time with significantly less memory, ideal when only the count of deletions is needed.
Final takeaway
- The minimum deletions to make two strings equal equals the total length minus twice the LCS length.
- Memoization, tabulation, and space optimized tabulation all achieve O(n*m) time.
- Choose the space optimized method for large inputs and memory efficiency, or tabulation if you might reconstruct the LCS later.
For any changes to the article, kindly email at code@codeanddebug.in or contact us at +91-9712928220.