The Count Distinct Substrings problem asks us to calculate the number of unique substrings of a given string.
- A substring is a contiguous sequence of characters within a string.
- We must return the total count of distinct substrings, including the empty substring.
Examples
Example 1
Input: s = "ababa"
Output: 10
Explanation: Distinct substrings are
"a", "b", "ab", "ba", "aba", "bab", "abab", "baba", "ababa", "" (empty substring).
Example 2
Input: s = "aaa"
Output: 4
Explanation: Distinct substrings are
"a", "aa", "aaa", "" (empty substring).
Intuition and Approach
We can solve this in two ways:
Approach 1 – Brute Force with Set
- Generate all possible substrings of
s
. - Store them in a set (to ensure uniqueness).
- Return the size of the set + 1 (to count the empty substring).
This works but is inefficient for larger strings since substring generation is costly.
Code – Brute Force
def countDistinctSubstrings(s):
n = len(s)
substrings = set()
# Generate all substrings
for i in range(n):
temp = ""
for j in range(i, n):
temp += s[j]
substrings.add(temp)
# +1 to include the empty substring
return len(substrings) + 1
Dry Run – Brute Force
Input: "ababa"
- i = 0 → “a”, “ab”, “aba”, “abab”, “ababa”
- i = 1 → “b”, “ba”, “bab”, “baba”
- i = 2 → “a”, “ab”, “aba” (already counted)
- i = 3 → “b”, “ba” (already counted)
- i = 4 → “a” (already counted)
Unique substrings = 9 + empty substring = 10.
Complexity – Brute Force
- Time Complexity: O(N²) substrings × O(N) for building strings = O(N³).
- Space Complexity: O(N²) for storing substrings in a set.
This approach is simple but not scalable for larger strings.
Approach 2 – Optimized Trie Solution
Instead of generating all substrings explicitly, we can:
- Insert all suffixes of the string into a Trie.
- While inserting each suffix, count how many new nodes are created.
- Each new node corresponds to a new unique substring.
- Add
+1
for the empty substring.
This avoids duplicate substrings automatically, since the Trie merges common prefixes.
Code – Trie Approach
class TrieNode:
def __init__(self):
self.children = {}
class Trie:
def __init__(self):
self.root = TrieNode()
def insert_and_count(self, word):
node = self.root
count_new_nodes = 0
for ch in word:
if ch not in node.children:
node.children[ch] = TrieNode()
count_new_nodes += 1
node = node.children[ch]
return count_new_nodes
def countDistinctSubstrings(s):
trie = Trie()
result = 0
# Insert all suffixes
for i in range(len(s)):
suffix = s[i:]
result += trie.insert_and_count(suffix)
# +1 for empty substring
return result + 1
Dry Run – Trie Approach
Input: "ababa"
- Insert suffix
"ababa"
→ creates 5 new nodes. - Insert suffix
"baba"
→ creates 3 new nodes (overlaps with"ababa"
). - Insert suffix
"aba"
→ creates 1 new node. - Insert suffix
"ba"
→ creates 1 new node. - Insert suffix
"a"
→ creates 0 new nodes (already exists).
Total = 10 (including empty substring).
Complexity – Trie Approach
- Time Complexity: O(N²), since we insert N suffixes, each up to length N.
- Space Complexity: O(N²), since at most all substrings could be stored in the Trie.
Although both are O(N²), the Trie avoids building duplicate substrings and is significantly faster in practice compared to brute force.
Conclusion
The Count Distinct Substrings problem can be solved in two ways:
- Brute Force (Set): Simple, but slow (O(N³)).
- Trie Approach: Efficient, avoids duplicates naturally, and better suited for large inputs.
The Trie-based solution is the optimal approach for this problem, leveraging prefix sharing to reduce redundant work.