How to Implement and Use Binary Search Trees in Python

You want to learn about Binary Search Trees (BST)—a fundamental hierarchical data structure that organizes nodes in a way that enables efficient searching, insertion, and deletion. A BST follows a strict ordering rule: for any node, all values in its left subtree are smaller than the node’s value, and all values in its right subtree are larger (duplicates are typically not allowed or handled with specific rules).

BSTs are the foundation for more advanced structures like AVL trees and Red-Black trees, and they’re widely used in databases, file systems, and search applications.


Core Properties of a BST

To qualify as a BST, a binary tree must satisfy these rules for every node:

  1. All nodes in the left subtree have values < current node’s value.
  2. All nodes in the right subtree have values > current node’s value.
  3. No duplicate values (by default—can be modified to allow duplicates by storing counts).
  4. Both left and right subtrees are also BSTs (recursive property).

Example of a Valid BST

plaintext

        8 (Root)
       / \
      3   10
     / \    \
    1   6    14
       / \   /
      4   7 13

  • Left subtree of 8: [3, 1, 6, 4, 7] (all < 8)
  • Right subtree of 8: [10, 14, 13] (all > 8)
  • This structure allows efficient searching (e.g., find 6 by comparing 8→3→6).

Key Terminology

  • Root: The topmost node (starting point of the tree).
  • Node: Contains a value, a left child pointer, and a right child pointer.
  • Leaf Node: A node with no left/right children (e.g., 1, 4, 7, 13 in the example).
  • Subtree: A tree formed by a node and all its descendants.
  • Height: The number of edges from a node to the deepest leaf (height of root = 3 in the example).

Binary Search Tree Implementation (Python)

We’ll implement a BST with core operations: insertsearchdeletetraversal (in-order, pre-order, post-order), and helper functions to find the minimum/maximum nodes.

Full Implementation

python

运行

class TreeNode:
    """Node class for Binary Search Tree (BST)."""
    def __init__(self, value):
        self.value = value  # Value stored in the node
        self.left = None    # Pointer to left child (smaller values)
        self.right = None   # Pointer to right child (larger values)

class BinarySearchTree:
    def __init__(self):
        self.root = None  # Root of the BST (initially empty)
    
    # -------------------------- Insert Operation --------------------------
    def insert(self, value):
        """Insert a value into the BST (no duplicates allowed)."""
        new_node = TreeNode(value)
        
        # Case 1: BST is empty—new node becomes root
        if self.root is None:
            self.root = new_node
            return True
        
        # Case 2: BST is not empty—find correct position
        current = self.root
        while True:
            # Duplicate value (skip or handle as needed)
            if value == current.value:
                print(f"Value {value} already exists in BST—skipping.")
                return False
            
            # Value is smaller—go left
            if value < current.value:
                if current.left is None:
                    current.left = new_node
                    return True
                current = current.left
            # Value is larger—go right
            else:
                if current.right is None:
                    current.right = new_node
                    return True
                current = current.right
    
    # -------------------------- Search Operation --------------------------
    def search(self, value):
        """Search for a value in the BST. Return True if found, False otherwise."""
        current = self.root
        while current is not None:
            if value == current.value:
                return True  # Value found
            elif value < current.value:
                current = current.left  # Search left subtree
            else:
                current = current.right  # Search right subtree
        return False  # Value not found
    
    # -------------------------- Traversal Operations --------------------------
    def _in_order_helper(self, node, result):
        """Helper for in-order traversal (recursive)."""
        if node is not None:
            self._in_order_helper(node.left, result)  # Traverse left subtree
            result.append(node.value)                 # Visit current node
            self._in_order_helper(node.right, result) # Traverse right subtree
    
    def in_order_traversal(self):
        """In-order traversal (Left → Root → Right) — returns sorted values!"""
        result = []
        self._in_order_helper(self.root, result)
        return result
    
    def _pre_order_helper(self, node, result):
        """Helper for pre-order traversal (recursive)."""
        if node is not None:
            result.append(node.value)                 # Visit current node
            self._pre_order_helper(node.left, result)  # Traverse left subtree
            self._pre_order_helper(node.right, result) # Traverse right subtree
    
    def pre_order_traversal(self):
        """Pre-order traversal (Root → Left → Right) — useful for copying the tree."""
        result = []
        self._pre_order_helper(self.root, result)
        return result
    
    def _post_order_helper(self, node, result):
        """Helper for post-order traversal (recursive)."""
        if node is not None:
            self._post_order_helper(node.left, result)  # Traverse left subtree
            self._post_order_helper(node.right, result) # Traverse right subtree
            result.append(node.value)                 # Visit current node
    
    def post_order_traversal(self):
        """Post-order traversal (Left → Right → Root) — useful for deleting the tree."""
        result = []
        self._post_order_helper(self.root, result)
        return result
    
    # -------------------------- Helper Functions (for Delete) --------------------------
    def _find_min_node(self, node):
        """Find the node with the minimum value in a subtree (always the leftmost node)."""
        current = node
        while current.left is not None:
            current = current.left
        return current
    
    # -------------------------- Delete Operation --------------------------
    def delete(self, value):
        """Delete a node with the given value from the BST. Return True if deleted, False otherwise."""
        # Recursive helper function to handle deletion logic
        def _delete_helper(node, value):
            # Base case: Node not found
            if node is None:
                return None
            
            # Step 1: Find the node to delete
            if value < node.value:
                node.left = _delete_helper(node.left, value)
            elif value > node.value:
                node.right = _delete_helper(node.right, value)
            # Step 2: Node found—handle 3 cases
            else:
                # Case 1: Node is a leaf (no children)
                if node.left is None and node.right is None:
                    return None
                
                # Case 2: Node has only one child (left or right)
                elif node.left is None:
                    return node.right  # Replace node with right child
                elif node.right is None:
                    return node.left   # Replace node with left child
                
                # Case 3: Node has two children—find successor (min of right subtree)
                successor = self._find_min_node(node.right)
                node.value = successor.value  # Replace node's value with successor's value
                node.right = _delete_helper(node.right, successor.value)  # Delete the successor
        
            return node
        
        # Update root after deletion
        original_root = self.root
        self.root = _delete_helper(self.root, value)
        return self.root != original_root  # Return True if deletion occurred

# Test the Binary Search Tree
if __name__ == "__main__":
    # Initialize BST
    bst = BinarySearchTree()
    
    # Insert values
    values = [8, 3, 10, 1, 6, 14, 4, 7, 13]
    for val in values:
        bst.insert(val)
    
    # Traversals
    print("In-order Traversal (sorted):", bst.in_order_traversal())    # Output: [1, 3, 4, 6, 7, 8, 10, 13, 14]
    print("Pre-order Traversal:", bst.pre_order_traversal())          # Output: [8, 3, 1, 6, 4, 7, 10, 14, 13]
    print("Post-order Traversal:", bst.post_order_traversal())        # Output: [1, 4, 7, 6, 3, 13, 14, 10, 8]
    
    # Search
    print("\nIs 6 present?", bst.search(6))  # Output: True
    print("Is 9 present?", bst.search(9))    # Output: False
    
    # Delete
    bst.delete(6)  # Delete node with two children
    print("\nAfter deleting 6 (in-order):", bst.in_order_traversal()) # Output: [1, 3, 4, 7, 8, 10, 13, 14]
    
    bst.delete(10) # Delete node with one child
    print("After deleting 10 (in-order):", bst.in_order_traversal())  # Output: [1, 3, 4, 7, 8, 13, 14]
    
    bst.delete(1)  # Delete leaf node
    print("After deleting 1 (in-order):", bst.in_order_traversal())   # Output: [3, 4, 7, 8, 13, 14]


Critical BST Operations Explained

1. Insertion

  • Start at the root and compare the new value with the current node’s value.
  • If the new value is smaller, move to the left child; if larger, move to the right child.
  • Repeat until an empty spot is found (insert the new node there).
  • Time Complexity: O(h) (h = height of the tree; O(log n) for balanced BST, O(n) for skewed BST).

2. Search

  • Similar to insertion: start at the root and compare the target value with the current node’s value.
  • Move left if the target is smaller, right if larger.
  • If the current node is None, the target is not in the BST.
  • Time Complexity: O(h) (same as insertion).

3. Deletion (Trickiest Operation)

Deleting a node requires preserving the BST property. There are 3 cases:

  • Case 1: Leaf Node (No Children): Simply remove the node (set its parent’s pointer to None).
  • Case 2: One Child: Replace the node with its child (set the parent’s pointer to the child).
  • Case 3: Two Children: Replace the node’s value with its successor (the smallest node in its right subtree) or predecessor (the largest node in its left subtree), then delete the successor/predecessor.
  • Time Complexity: O(h) (dominated by finding the node and its successor).

4. Traversals

  • In-order: Left → Root → Right → Returns values in sorted order (BST’s most useful traversal).
  • Pre-order: Root → Left → Right → Useful for copying the tree or creating a prefix expression.
  • Post-order: Left → Right → Root → Useful for deleting the tree or creating a postfix expression.

Time and Space Complexity

OperationAverage Case (Balanced BST)Worst Case (Skewed BST)Explanation
InsertO(log n)O(n)Balanced BST has height log n; skewed BST (like a linked list) has height n.
SearchO(log n)O(n)Same as insertion.
DeleteO(log n)O(n)Same as insertion (plus finding successor/predecessor).
Traversals (In/Pre/Post)O(n)O(n)Must visit all n nodes.
Space Complexity (Recursive Traversals)O(h)Recursion stack depth = height of the tree.

Key Note: Skewed BSTs

A skewed BST is a worst-case scenario where the tree degenerates into a linked list (e.g., inserting values in sorted order: 1→2→3→4→5). This makes operations O(n) instead of O(log n). To fix this, use self-balancing BSTs (AVL, Red-Black trees), which automatically maintain balance.


Pros and Cons of BSTs

Pros

  1. Efficient Sorted Operations: In-order traversal returns sorted values in O(n) time.
  2. Faster Than Arrays for Dynamic Data: Insertions/deletions are O(log n) (balanced) vs. O(n) for arrays (shifting elements).
  3. No Preallocated Memory: Grows dynamically as nodes are added (no wasted space).
  4. Foundation for Advanced Structures: Self-balancing BSTs (AVL, Red-Black) build on BST logic for guaranteed O(log n) performance.

Cons

  1. Unbalanced BSTs Are Slow: Skewed BSTs degrade to O(n) time for core operations.
  2. No Random Access: Cannot access elements by index—must traverse from the root.
  3. Complex Deletion: Deleting nodes with two children requires extra steps (finding successors).
  4. Poor Cache Performance: Nodes are not stored in contiguous memory (worse than arrays).

Real-World Applications of BSTs

  1. Databases: Indexing (e.g., B-trees and B+ trees, which are balanced BST variants, power database indexing for fast queries).
  2. Search Engines: Storing and retrieving sorted data (e.g., autocomplete suggestions).
  3. File Systems: Organizing files and directories for efficient access (e.g., NTFS uses B+ trees).
  4. Binary Search Optimization: BSTs enable binary search-like performance for dynamic data (arrays require static data for binary search).
  5. Expression Parsers: Converting infix expressions to postfix/prefix using traversals (e.g., syntax trees).

Summary

Best use cases: Dynamic sorted data, indexing, and applications requiring efficient insertions/searches.

A BST is a hierarchical data structure where left subtree values < current node < right subtree values.

Core operations: Insert (O(h)), Search (O(h)), Delete (O(h)), and traversals (O(n)).

Key feature: In-order traversal returns sorted values—unique to BSTs.

Limitation: Unbalanced BSTs have O(n) performance; use self-balancing BSTs (AVL/Red-Black) for guaranteed O(log n).



了解 Ruigu Electronic 的更多信息

订阅后即可通过电子邮件收到最新文章。

Posted in

Leave a comment