You are given two strings word1 and word2. In one operation, you can delete exactly one character from either string. Return the minimum number of deletions required so that the two strings become the same.
Key idea: if LCS(word1, word2) is the length of their Longest Common Subsequence, then
minimum deletions = len(word1) + len(word2) − 2 × LCS(word1, word2).
Everything outside the LCS must be deleted from one side or the other.
Here’s the [Problem Link] to begin with.
Example 1
word1 = "sea", word2 = "eat"
LCS = "ea" with length 2
Answer = 3 + 3 − 2×2 = 2Example 2
word1 = "leetcode", word2 = "etco"
LCS length = 4
Answer = 8 + 4 − 2×4 = 4Below are three solutions. We explain the problem only once at the top, then for each solution provide Intuition and Approach, Code Implementation, Code explanation, Time and Space Complexity, and a short Conclusion.
Memoization
Intuition and Approach
Compute LCS using top down recursion with caching.
Let solve(i, j) be the LCS length of prefixes word1[:i] and word2[:j] using 1-based lengths:
- If
i == 0orj == 0, result is 0. - If
word1[i-1] == word2[j-1], take1 + solve(i-1, j-1). - Else take
max(solve(i-1, j), solve(i, j-1)).
Finally compute deletions as n + m − 2 × lcs.
Code Implementation
class Solution:
def solve(self, text1: str, text2: str, ind1: int, ind2: int, dp) -> int:
if ind1 == 0 or ind2 == 0:
return 0
if dp[ind1][ind2] != -1:
return dp[ind1][ind2]
if text1[ind1 - 1] == text2[ind2 - 1]:
dp[ind1][ind2] = 1 + self.solve(text1, text2, ind1 - 1, ind2 - 1, dp)
else:
dp[ind1][ind2] = 0 + max(
self.solve(text1, text2, ind1 - 1, ind2, dp),
self.solve(text1, text2, ind1, ind2 - 1, dp)
)
return dp[ind1][ind2]
def minDistance(self, word1: str, word2: str) -> int:
n = len(word1)
m = len(word2)
dp = [[-1 for _ in range(m + 1)] for _ in range(n + 1)]
lcs = self.solve(word1, word2, n, m, dp)
return n + m - (2 * lcs)Code explanation
dp[i][j]caches LCS length for prefixes of lengthsiandj.- Matching characters extend the subsequence from the diagonal state.
- Otherwise, we try skipping one character from either string and take the maximum.
- After computing LCS, use the formula to get the minimum deletions.
Time and Space Complexity
- Precise: states
n*m, each computed once, time O(n*m); DP table O(n*m) and recursion stack O(n + m). - Simplified: O(n*m) time, O(n*m) space.
Conclusion
Memoization converts exponential recursion into polynomial time and is simple to implement.
Tabulation
Intuition and Approach
Bottom up dynamic programming for LCS. Build a (n+1) × (m+1) table where row 0 and column 0 are zeros, then apply the standard match or max transition. Answer is n + m − 2 × dp[n][m].
Code Implementation
class Solution:
def longestCommonSubsequence(self, text1: str, text2: str) -> int:
n = len(text1)
m = len(text2)
dp = [[-1 for _ in range(m + 1)] for _ in range(n + 1)]
for ind2 in range(0, m + 1):
dp[0][ind2] = 0
for ind1 in range(0, n + 1):
dp[ind1][0] = 0
for ind1 in range(1, n + 1):
for ind2 in range(1, m + 1):
if text1[ind1 - 1] == text2[ind2 - 1]:
dp[ind1][ind2] = 1 + dp[ind1 - 1][ind2 - 1]
else:
dp[ind1][ind2] = 0 + max(dp[ind1 - 1][ind2], dp[ind1][ind2 - 1])
return dp[n][m]
def minDistance(self, word1: str, word2: str) -> int:
n = len(word1)
m = len(word2)
lcs = self.longestCommonSubsequence(word1, word2)
return n + m - (2 * lcs)Code explanation
- Initialize base cases for empty prefixes.
- For each cell, extend on match or carry forward the best from top or left on mismatch.
- Use the LCS length in the deletion formula.
Time and Space Complexity
- Precise: fill
n*mcells in O(1) each, time O(n*m); table O(n*m) space. - Simplified: O(n*m) time, O(n*m) space.
Conclusion
Deterministic and stack safe. Also convenient if you later want to reconstruct the LCS.
Tabulation (Space Optimize)
Intuition and Approach
dp[i][j] depends only on the previous row and the current row’s previous cell. Keep two 1D arrays to reduce space to linear.
Code Implementation
class Solution:
def longestCommonSubsequence(self, text1: str, text2: str) -> int:
n = len(text1)
m = len(text2)
prev = [0] * (m + 1)
for ind1 in range(1, n + 1):
cur = [0] * (m + 1)
for ind2 in range(1, m + 1):
if text1[ind1 - 1] == text2[ind2 - 1]:
cur[ind2] = 1 + prev[ind2 - 1]
else:
cur[ind2] = 0 + max(prev[ind2], cur[ind2 - 1])
prev = cur
return prev[m]
def minDistance(self, word1: str, word2: str) -> int:
n = len(word1)
m = len(word2)
lcs = self.longestCommonSubsequence(word1, word2)
return n + m - (2 * lcs)Code explanation
prev[j]holds the previous row’s LCS values andcur[j]the current row’s values.- Update from
prev[j-1],prev[j], andcur[j-1]. - After each row, roll
curintoprev. - Compute deletions using the final LCS value.
Time and Space Complexity
- Precise: still O(n*m) time; arrays of size
m+1give O(m) extra space. - Simplified: O(n*m) time, O(min(n, m)) space if you iterate columns along the shorter string.
Conclusion
Same optimal time with significantly less memory, ideal when only the count of deletions is needed.
Final takeaway
- The minimum deletions to make two strings equal equals the total length minus twice the LCS length.
- Memoization, tabulation, and space optimized tabulation all achieve O(n*m) time.
- Choose the space optimized method for large inputs and memory efficiency, or tabulation if you might reconstruct the LCS later.
For any changes to the article, kindly email at code@codeanddebug.in or contact us at +91-9712928220.
