Trie Intro
#include <bits/stdc++.h> // Includes common standard libraries like iostream, string, vector (for char array), etc.using namespace std; // Use the standard namespace.
// Class representing a single node in the Trie.class TrieNode {public: char data; // The character stored in this node. // An array of pointers to child TrieNodes. Size 26 for 'A' through 'Z'. TrieNode* children[26]; bool isTerminal; // True if a word ends at this node.
// Constructor for TrieNode. TrieNode(char ch) { this->data = ch; this->isTerminal = false; // Initially, no word ends here.
// Initialize all child pointers to NULL. for(int i = 0; i < 26; i++) { this->children[i] = NULL; } } // Note: For production code, a destructor would be needed to recursively delete children // to prevent memory leaks when the Trie itself is destroyed.};
// Class representing the Trie data structure.class Trie {public: TrieNode* root; // The root node of the Trie.
// Constructor for Trie. Initializes the root node with a null character. Trie() { root = new TrieNode('\0'); }
// --- Insertion Logic --- // Private helper function for inserting a word into the Trie. // 'root': Current node being processed (starts from the main Trie root). // 'word': The remaining part of the word to insert. void insertUtil(TrieNode* root, string word) { // Base case: If the word becomes empty, we've inserted all its characters. if(word.length() == 0) { root->isTerminal = true; // Mark this node as the end of a word. return; }
// Assumption: Word contains only uppercase English letters ('A' through 'Z'). int index = word[0] - 'A'; // Calculate the index for the current character. TrieNode* child;
// Check if the child node for the current character already exists. if(root->children[index] != NULL) { // Character is already present in the Trie path. child = root->children[index]; } else { // Character is absent. Create a new TrieNode for it. child = new TrieNode(word[0]); root->children[index] = child; // Link the new child node. }
// Recursive call: Process the rest of the word from the child node. insertUtil(child, word.substr(1)); }
// Public method to insert a word. void insertNode(string word) { insertUtil(root, word); }
// --- Search Logic --- // Private helper function for searching a word in the Trie. // 'root': Current node being processed. // 'word': The remaining part of the word to search for. bool searchUtil(TrieNode* root, string word) { // Base case: If the word is empty, we've traversed all its characters. // Return true only if this node marks the end of a word. if(word.length() == 0) { return root->isTerminal; }
// Calculate index for the current character. int index = word[0] - 'A'; TrieNode* child;
// Check if the child node for the current character exists. if(root->children[index] != NULL) { // Character is present, move to the child. child = root->children[index]; } else { // Character is absent, meaning the word is not in the Trie. return false; }
// Recursive call: Continue searching from the child node with the rest of the word. return searchUtil(child, word.substr(1)); }
// Public method to search for a word. bool search(string word) { return searchUtil(root, word); }
// --- Removal (Logical Deletion) Logic --- // Private helper function to logically remove a word (just marks isTerminal false). // This function doesn't free memory or remove nodes, only unmarks a word. bool removeUtil(TrieNode* root, string word) { // Base case: If the word is empty, we've reached the node for the word. if(word.length() == 0) { if(root->isTerminal) { root->isTerminal = false; // Unmark as terminal. Word is logically removed. return true; // Successfully removed. } else { return false; // Word path exists, but it was not a terminal word. } }
// Calculate index for the current character. int index = word[0] - 'A'; TrieNode* child;
// Check if the child node exists. if(root->children[index] != NULL) { child = root->children[index]; } else { // Character path doesn't exist, so the word is not in the Trie. return false; }
// Recursively call removeUtil on the child. return removeUtil(child, word.substr(1)); }
// Public method to logically remove a word. bool remove(string word) { return removeUtil(root, word); }
// --- Erasure (Physical Deletion) Logic --- // This implementation has a CRITICAL FLAW for actual physical deletion. // It will cause memory corruption and crashes if nodes are shared prefixes of other words. bool eraseUtil(TrieNode* root, string word) { // Base case: Reached the end of the word. if(word.length() == 0) { if(root->isTerminal) { root->isTerminal = false; // Unmark as terminal. // CRITICAL FLAW: Deleting 'root' here is premature. // 'root' could be a prefix for other words or have other children. // A node can only be deleted if it is NOT terminal AND has no other children. // delete root; // This line should be managed carefully by parent. return true; // Word was found and un-marked. } else { return false; // Word path exists, but it was not a terminal word. } }
// Calculate index for the current character. int index = word[0] - 'A'; TrieNode* child;
// Check if the child node exists. if(root->children[index] != NULL) { child = root->children[index]; } else { return false; // Word not found. }
// Recursive call to erase the rest of the word. bool word_deleted = eraseUtil(child, word.substr(1));
// CRITICAL FLAW: 'delete child;' here unconditionally deletes the child // even if it's part of another word's path or has other branches. // Proper deletion requires checking if 'child' has no other children AND is not terminal. // If it's safe to delete, then root->children[index] should be set to NULL. // delete child; // This will cause issues for common prefixes.
return word_deleted; }
// Public method for attempting to physically erase a word. // WARNING: The implementation of eraseUtil has a flaw that can lead to memory errors. bool erase(string word) { return eraseUtil(root, word); }};
int main() {
Trie* T = new Trie(); // Create a new Trie object.
// Insert the word "AKASH". T->insertNode("AKASH");
// Search for "AKASH". bool present = T->search("AKASH"); if(present) { cout << "Word is Present!" << endl; // Expected: Word is Present! } else { cout << "Word is Absent!" << endl; }
// Logical removal of "AKASH" (sets isTerminal to false). T->remove("AKASH"); // This is the 'logical' removal function. present = T->search("AKASH"); if(present) { cout << "Word is Present!" << endl; } else { cout << "Word is Absent!" << endl; // Expected: Word is Absent! }
// Attempting to access a child after logical removal. // The node for 'A' should still exist if 'AKASH' was the only word. // This line might crash if 'delete child' was used from a faulty physical delete. // With 'remove', it's just logically removed, so nodes still exist. // This line accesses root->children[0] (for 'A') and then its data. cout << "Data of first child of root: " << T->root->children[0]->data << endl; // Expected: A
// Re-insert "AKASH". T->insertNode("AKASH"); present = T->search("AKASH"); if(present) { cout << "Word is Present!" << endl; // Expected: Word is Present! } else { cout << "Word is Absent!" << endl; }
// Test the flawed 'erase' function. // If you uncomment and run this with other words, it might crash. // T->insertNode("APPLE"); // T->insertNode("APP"); // T->erase("APP"); // This will likely delete the node for the second 'P', corrupting "APPLE".
return 0;}
1. What is a Trie?
- A Trie (pronounced “try” or “tree”, from retrieval tree) is a tree-like data structure used to store a dynamic set of strings or associative array where keys are strings.
- Purpose: Enables efficient retrieval of strings based on prefixes. Common uses include autocomplete, spell checkers, dictionary implementations, and IP routing.
- Structure:
- Each node in the Trie represents a character.
- Edges from a node lead to its children, which represent the next characters in a sequence.
- A word is formed by traversing a path from the root.
- A special boolean flag (
isTerminal
) at each node indicates if a complete word ends at that node.
- Advantages:
- Faster lookups than hash tables in worst case (no collisions).
- Efficient for prefix matching.
- Alphabetical ordering of keys is implicit.
- Disadvantages:
- Can consume more memory than hash tables for storing a large number of short strings, especially if the alphabet size is large.
2. TrieNode
Class
char data
: The character this node represents. (Often, the root node stores a null character'\0'
).TrieNode* children[26]
: An array of pointers to child nodes. The size26
is for the English alphabet (A-Z or a-z, depending on conversion logic).children[i]
points to the node for thei
-th character.NULL
means no child exists for that character.bool isTerminal
:true
if a word ends at this node;false
otherwise. This is crucial to distinguish a prefix from a complete word (e.g., “APP” vs “APPLE”).- Constructor: Initializes
data
, setsisTerminal
tofalse
, and allchildren
pointers toNULL
.
3. Trie
Class Methods
Trie()
Constructor: Initializes the Trie with aroot
node (typically'\0'
).insertUtil(TrieNode* &root, string word)
(Recursive Helper):- Base Case: If
word
is empty, it means we’ve processed all characters. Setroot->isTerminal = true
to mark the end of a valid word. - Recursive Step:
- Determine
index
forword[0]
(e.g.,word[0] - 'A'
for uppercase). - If
root->children[index]
isNULL
, create a newTrieNode
forword[0]
and link it. - Recursively call
insertUtil
on thischild
node with theword.substr(1)
(remaining part).
- Determine
- Base Case: If
insertNode(string word)
: Public wrapper forinsertUtil
.searchUtil(TrieNode* root, string word)
(Recursive Helper):- Base Case: If
word
is empty, returnroot->isTerminal
. (If we reached here and this node is terminal, the word exists; otherwise, it’s just a prefix). - Recursive Step:
- Determine
index
forword[0]
. - If
root->children[index]
isNULL
, the character path does not exist. Returnfalse
. - Else, recursively call
searchUtil
on thechild
node withword.substr(1)
.
- Determine
- Base Case: If
search(string word)
: Public wrapper forsearchUtil
.removeUtil(TrieNode* root, string word)
(Logical Deletion):- Base Case: If
word
is empty, androot->isTerminal
is true, setroot->isTerminal = false
. This logically removes the word without deleting nodes. - Recursive Step: Traverse path. If character path doesn’t exist, word isn’t there.
- Base Case: If
remove(string word)
: Public wrapper forremoveUtil
.eraseUtil(TrieNode* root, string word)
(Physical Deletion - CRITICAL FLAW IN PROVIDED IMPLEMENTATION):- Intention: To physically remove nodes that are no longer part of any word.
- Base Case: If
word
is empty, ifroot->isTerminal
is true, setroot->isTerminal = false
. The provideddelete root;
here is problematic. You should only delete a node if it’s no longer part of any other word AND has no other children. - Recursive Step:
- Traverse to the child.
bool ans = eraseUtil(child, word.substr(1));
- The return valueans
here indicates if the word was found and marked non-terminal.delete child;
: This line is the major flaw. Ifchild
has other children (meaning it’s a prefix for another word like “APP” is for “APPLE”), or ifchild
is itself a terminal node for another word, deleting it will corrupt the Trie.
- Correct Physical Deletion Logic (Conceptual):
- Perform a logical removal (
isTerminal = false
). - After the recursive call returns, check the
child
node:- Is
child->isTerminal
false
? (No other word ends here). - Does
child
have no other children (i.e., allchildren[i]
areNULL
)? - If both conditions are true, then it’s safe to
delete child
. - Also, if a node is deleted, its parent’s pointer to it must be set to
NULL
.
- Is
- Perform a logical removal (
erase(string word)
: Public wrapper foreraseUtil
. Be cautious using the providederase
due to the mentioned flaw.
4. Complexity Analysis
- Let
L
be the length of the word being inserted/searched/removed. - Let
A
be the size of the alphabet (e.g., 26 for English letters). - Insertion:
O(L)
- We traverseL
nodes, and each step isO(1)
(array access). - Search:
O(L)
- Similar traversal. - Removal (
remove
- logical deletion):O(L)
- Similar traversal to find and updateisTerminal
. - Erasure (
erase
- physical deletion with flawed implementation):O(L)
for traversal, but leads to incorrect behavior and potential crashes. A correctly implemented physical deletion would also beO(L)
. - Space Complexity:
O(Total_Characters_Stored * A)
orO(Total_Nodes_in_Trie * A)
. In the worst case (no common prefixes), it’sO(Sum_of_Lengths_of_All_Words * A)
.