Given two strings text1
and text2
, the goal is to print one actual Longest Common Subsequence (LCS), not just its length. An LCS is a sequence that appears in both strings in the same relative order, but not necessarily contiguously. If multiple LCS strings exist, returning any one of them is acceptable.
Example
text1 = "abcde", text2 = "ace"
LCS length = 3, one LCS is "ace"
text1 = "abc", text2 = "def"
LCS length = 0, LCS string is ""
Intuition and Approach
The standard approach has two clear phases.
- Compute the DP table of LCS lengths
Build a tabledp
of size(n+1) x (m+1)
wheredp[i][j]
stores the LCS length for prefixestext1[:i]
andtext2[:j]
.
Transition:- If
text1[i-1] == text2[j-1]
, thendp[i][j] = 1 + dp[i-1][j-1]
. - Otherwise,
dp[i][j] = max(dp[i-1][j], dp[i][j-1])
.
- If
- Backtrack to reconstruct one LCS string
Start fromdp[n][m]
and walk backwards:- If characters match, that character is part of the LCS. Append it and move diagonally to
(i-1, j-1)
. - If they do not match, move to the neighbor cell that has the larger LCS value.
The path you trace yields the LCS in reverse order, so reverse the collected characters at the end.
- If characters match, that character is part of the LCS. Append it and move diagonally to
This method guarantees correctness because the backtrack follows exactly the choices that led to the optimal length stored in the table.
Code Implementation
Your provided implementation, unchanged, with brief comments added only:
class Solution:
def longestCommonSubsequence(self, text1: str, text2: str) -> str:
n, m = len(text1), len(text2)
# 1) Build DP table (lengths)
dp = [[0] * (m + 1) for _ in range(n + 1)]
for i in range(1, n + 1):
for j in range(1, m + 1):
if text1[i - 1] == text2[j - 1]:
dp[i][j] = 1 + dp[i - 1][j - 1]
else:
dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
# 2) Backtrack from bottom-right to build one LCS
i, j = n, m
lcs_chars = []
while i > 0 and j > 0:
if text1[i - 1] == text2[j - 1]:
lcs_chars.append(text1[i - 1]) # pick the char
i -= 1
j -= 1
else:
# Move in the direction of the larger value
if dp[i - 1][j] >= dp[i][j - 1]:
i -= 1
else:
j -= 1
# 3) Reverse and join to get the LCS string
return ''.join(reversed(lcs_chars))
Code explanation
- The
dp
table stores LCS lengths for all prefix pairs. The extra row and column at index 0 handle empty prefixes and simplify boundaries. - After filling
dp
, the algorithm starts from(n, m)
and reconstructs a valid LCS by:- Taking matched characters and moving diagonally when
text1[i-1] == text2[j-1]
. - Otherwise stepping toward the neighbor cell that preserved the larger LCS value during the build phase.
- Taking matched characters and moving diagonally when
- Collected characters are in reverse, so the final step reverses them to produce the correct order.
Time and Space Complexity
- Precise
- Building the table visits each cell once: O(n · m) time
- Backtracking touches at most
n + m
cells: O(n + m) time - Total time: O(n · m + n + m) which simplifies to O(n · m)
- Space for the table: O(n · m)
- Simplified
- O(n · m) time and O(n · m) space
If only the length is required, space can be optimized to O(min(n, m)). For printing the actual string using this exact backtrack approach, the full table is convenient and standard.
Conclusion
To print an LCS, first compute the DP length table, then backtrack from the bottom right to collect one valid subsequence. This method is deterministic, easy to implement, and runs in O(n · m) time. It is the canonical solution when you need the actual subsequence rather than just its length.
For any changes to the article, kindly email at code@codeanddebug.in or contact us at +91-9712928220.