You want to master the Union-Find data structure—also known as Disjoint Set Union (DSU)—a powerful tool for tracking and merging disjoint sets of elements. Its core purpose is to efficiently manage two key operations: find (determine which set an element belongs to) and union (merge two sets), while avoiding redundant work through optimizations like path compression and union by rank/size.
Union-Find is the backbone of algorithms like Kruskal’s (MST), used in network design, cluster analysis, and cycle detection. It shines because its operations are nearly constant time (amortized O(α(V)), where α is the inverse Ackermann function—effectively O(1) for all practical purposes).
Core Concepts
Before diving into implementation, let’s clarify the basics:
- Disjoint Sets: Sets with no overlapping elements (e.g., {0,1,2} and {3,4} are disjoint).
- Find Operation: Given an element, find the root (representative) of its set. The root is a unique identifier for the set (e.g., if 0 is the root of {0,1,2},
find(2)returns 0). - Union Operation: Merge two sets into one by linking their roots (e.g., merging {0,1,2} and {3,4} creates {0,1,2,3,4} with a single root).
- Key Optimizations:
- Path Compression: Flattens the structure of the tree during
find, so future queries are faster. - Union by Rank/Size: Ensures the smaller tree is always attached to the root of the larger tree, keeping the tree shallow.
- Path Compression: Flattens the structure of the tree during
Union-Find Implementation (Python)
We’ll implement Union-Find with both optimizations (path compression + union by rank) for maximum efficiency. This implementation supports numeric elements (easily adaptable to other types with a hash map).
Full Implementation
python
运行
class UnionFind:
"""
Union-Find (Disjoint Set Union, DSU) data structure with path compression and union by rank.
Supports numeric elements (0-indexed or arbitrary integers).
"""
def __init__(self, size):
"""
Initialize Union-Find with `size` elements (0 to size-1).
Args:
size (int): Number of elements to manage (must be ≥ 1).
"""
if size < 1:
raise ValueError("Size must be at least 1")
# parent[i] = parent of element i (initially, each element is its own parent)
self.parent = list(range(size))
# rank[i] = upper bound of the height of the tree rooted at i (for union by rank)
self.rank = [0] * size
# Optional: Track size of each set (alternative to rank for union by size)
self.size = [1] * size
def find(self, x):
"""
Find the root of element x with path compression.
Args:
x (int): Element to find (must be between 0 and size-1).
Returns:
int: Root of the set containing x.
"""
# Validate input
if x < 0 or x >= len(self.parent):
raise IndexError("Element out of bounds")
# Path compression: Make x's parent point directly to the root
if self.parent[x] != x:
self.parent[x] = self.find(self.parent[x]) # Recursively find root and update parent
return self.parent[x]
def union(self, x, y):
"""
Merge the sets containing elements x and y using union by rank.
Args:
x (int): First element (0 ≤ x < size).
y (int): Second element (0 ≤ y < size).
Returns:
bool: True if sets were merged (x and y were in different sets), False otherwise (already in the same set).
"""
# Find roots of x and y
root_x = self.find(x)
root_y = self.find(y)
# If already in the same set, do nothing
if root_x == root_y:
return False
# Union by rank: Attach smaller rank tree to root of larger rank tree
if self.rank[root_x] < self.rank[root_y]:
self.parent[root_x] = root_y
self.size[root_y] += self.size[root_x] # Update size of root_y's set
else:
self.parent[root_y] = root_x
self.size[root_x] += self.size[root_y] # Update size of root_x's set
# If ranks are equal, increment rank of the new root
if self.rank[root_x] == self.rank[root_y]:
self.rank[root_x] += 1
return True
def connected(self, x, y):
"""
Check if elements x and y are in the same set.
Args:
x (int): First element.
y (int): Second element.
Returns:
bool: True if x and y are in the same set, False otherwise.
"""
return self.find(x) == self.find(y)
def get_set_size(self, x):
"""
Get the size of the set containing element x.
Args:
x (int): Element to query.
Returns:
int: Size of the set.
"""
root = self.find(x)
return self.size[root]
def get_num_sets(self):
"""
Get the number of disjoint sets currently managed.
Returns:
int: Number of disjoint sets.
"""
# Count elements that are their own parent (roots)
return sum(1 for i in range(len(self.parent)) if self.parent[i] == i)
# Test the Union-Find implementation
if __name__ == "__main__":
# Initialize Union-Find with 6 elements (0-5)
uf = UnionFind(6)
print("Initial number of sets:", uf.get_num_sets()) # Output: 6
# Merge sets
uf.union(0, 1)
uf.union(1, 2)
uf.union(3, 4)
print("\nAfter merging (0-1-2) and (3-4):")
print("Number of sets:", uf.get_num_sets()) # Output: 3 (sets: {0,1,2}, {3,4}, {5})
print("0 and 2 are connected?", uf.connected(0, 2)) # Output: True
print("2 and 3 are connected?", uf.connected(2, 3)) # Output: False
print("Size of set containing 1:", uf.get_set_size(1)) # Output: 3
# Merge more sets
uf.union(2, 3)
uf.union(4, 5)
print("\nAfter merging (0-1-2-3-4) and (5):")
print("Number of sets:", uf.get_num_sets()) # Output: 1 (all elements connected)
print("0 and 5 are connected?", uf.connected(0, 5)) # Output: True
print("Size of set containing 5:", uf.get_set_size(5)) # Output: 6
Sample Output
plaintext
Initial number of sets: 6
After merging (0-1-2) and (3-4):
Number of sets: 3
0 and 2 are connected? True
2 and 3 are connected? False
Size of set containing 1: 3
After merging (0-1-2-3-4) and (5):
Number of sets: 1
0 and 5 are connected? True
Size of set containing 5: 6
How the Optimizations Work
1. Path Compression (in find method)
Without path compression, the tree can become deep (e.g., 0 → 1 → 2 → 3 → 4), making find(4) require 4 steps. With path compression:
- When
find(4)is called, it recursively finds the root (0) and updates all nodes along the path to point directly to 0. - Subsequent calls to
find(2)orfind(4)will take only 1 step (direct access to root).
2. Union by Rank (in union method)
Without union by rank, merging a tall tree into a short tree can create a deep tree. Union by rank ensures:
- The shorter tree is always attached to the root of the taller tree.
- If trees are the same height, the new root’s rank is incremented (to track the maximum possible height).
- This keeps the tree height logarithmic in the number of elements, ensuring fast
findoperations.
Time and Space Complexity
| Operation | Amortized Time Complexity | Explanation |
|---|---|---|
find(x) | O(α(V)) | α(V) is the inverse Ackermann function—grows extremely slowly (α(V) < 5 for V < 10^600). Effectively O(1) in practice. |
union(x, y) | O(α(V)) | Depends on two find operations (each O(α(V))) plus constant-time root linking. |
connected(x,y) | O(α(V)) | Two find operations. |
get_set_size(x) | O(α(V)) | One find operation plus constant-time size lookup. |
| Space Complexity | Value | Explanation |
|---|---|---|
| Overall | O(V) | Stores parent, rank, and size arrays of size V (number of elements). |
Pros and Cons of Union-Find
Pros
- Near-Constant Time Operations: Amortized O(α(V)) time for all core operations—faster than most other data structures for set management.
- Simple to Implement: Core logic is straightforward, with optimizations adding minimal complexity.
- Memory Efficient: Uses O(V) space, which is optimal for tracking V elements.
- Versatile: Powers critical algorithms (Kruskal’s, cycle detection) and works for diverse use cases (clustering, network connectivity).
Cons
- Limited to Numeric Elements (by Default): To handle non-numeric elements (e.g., strings, objects), you need a hash map to map elements to numeric indices.
- No Split Operation: Union-Find does not support splitting a set into two disjoint sets efficiently (this would require a more complex data structure like a link-cut tree).
- Requires Fixed Initial Size: The initial size of the Union-Find structure is fixed (though you can extend it with dynamic resizing if needed).
Real-World Applications of Union-Find
- Kruskal’s Algorithm (MST): Uses Union-Find to avoid cycles while adding edges to the MST (critical for network design with minimum cost).
- Cycle Detection: In undirected graphs—for each edge (u, v), if
find(u) == find(v), a cycle exists. - Network Connectivity: Tracking which nodes in a network are connected (e.g., internet routers, social networks).
- Cluster Analysis: Grouping similar data points (e.g., image segmentation, customer segmentation).
- Dynamic Connectivity: Answering connectivity queries in real-time (e.g., “Is node A connected to node B after adding edge C?”).
- Games: Managing player teams or territories (e.g., in strategy games where players merge regions).
Summary
Best use cases: Kruskal’s MST, cycle detection, network connectivity, and cluster analysis.
Union-Find (DSU) is a data structure for managing disjoint sets with two core operations: find (find root) and union (merge sets).
Key optimizations: path compression (speeds up find) and union by rank/size (keeps trees shallow).
Time complexity: Amortized O(α(V)) per operation (effectively O(1)); Space complexity: O(V).
- iPhone 15 Pro Review: Ultimate Features and Specs
- iPhone 15 Pro Max: Key Features and Specifications
- iPhone 16: Features, Specs, and Innovations
- iPhone 16 Plus: Key Features & Specs
- iPhone 16 Pro: Premium Features & Specs Explained
- iPhone 16 Pro Max: Features & Innovations Explained
- iPhone 17 Pro: Features and Innovations Explained
- iPhone 17 Review: Features, Specs, and Innovations
- iPhone Air Concept: Mid-Range Power & Portability
- iPhone 13 Pro Max Review: Features, Specs & Performance
- iPhone SE Review: Budget Performance Unpacked
- iPhone 14 Review: Key Features and Upgrades
- Apple iPhone 14 Plus: The Ultimate Mid-range 5G Smartphone
- iPhone 14 Pro: Key Features and Innovations Explained
- Why the iPhone 14 Pro Max Redefines Smartphone Technology
- iPhone 15 Review: Key Features and Specs
- iPhone 15 Plus: Key Features and Specs Explained
- iPhone 12 Mini Review: Compact Powerhouse Unleashed
- iPhone 12: Key Features and Specs Unveiled
- iPhone 12 Pro: Premium Features and 5G Connectivity
- Why the iPhone 12 Pro Max is a Top Choice in 2023
- iPhone 13 Mini: Compact Powerhouse in Your Hand
- iPhone 13: Key Features and Specs Overview
- iPhone 13 Pro Review: Features and Specifications






















Leave a comment