Kruskal’s Algorithm Explained: A Guide to MST

You want to learn about Kruskal’s Algorithm—a greedy algorithm used to find the Minimum Spanning Tree (MST) of a weighted undirected graph. An MST is a subset of the graph’s edges that connects all vertices (nodes) with the minimum possible total edge weight, and contains no cycles.

Kruskal’s Algorithm is ideal for sparse graphs and works by sorting all edges by weight and adding them one by one to the MST, while avoiding cycles using a Union-Find (Disjoint Set Union, DSU) data structure. It is widely used in network design (e.g., laying cables, connecting routers) and cluster analysis.


Core Concepts

Before diving into the algorithm, let’s clarify key terms:

  • Minimum Spanning Tree (MST): For a graph with V vertices, an MST has exactly V-1 edges (no cycles) and connects all vertices with the smallest possible sum of edge weights. If the graph is disconnected, it has no MST (or multiple MSTs for each connected component).
  • Union-Find (DSU): A data structure that efficiently tracks and merges disjoint sets of vertices. It supports two core operations:
    • find(u): Finds the root (representative) of the set containing vertex u (with path compression to speed up future queries).
    • union(u, v): Merges the sets containing vertices u and v (with union by rank/size to keep the tree shallow).

Core Principles of Kruskal’s Algorithm

Kruskal’s Algorithm follows these greedy steps to build an MST:

  1. Sort All Edges: Sort all edges of the graph in ascending order of their weights (prioritize lighter edges first).
  2. Initialize Union-Find: Create a Union-Find structure to track connected components of vertices.
  3. Build MST: Iterate through the sorted edges:
    • For each edge (u, v, weight):
      • If u and v are in different connected components (checked via find(u) != find(v)):
        • Add the edge to the MST.
        • Merge the components containing u and v (via union(u, v)).
  4. Terminate: Stop when the MST contains V-1 edges (all vertices are connected) or all edges are processed (graph is disconnected).

Why This Works

By always choosing the lightest edge that connects two disjoint components, Kruskal’s Algorithm ensures the MST has the minimum total weight (greedy choice property). The Union-Find structure efficiently prevents cycles by only adding edges that connect separate components.


Kruskal’s Algorithm Implementation (Python)

We’ll implement Kruskal’s Algorithm with a Union-Find data structure optimized for performance (path compression + union by rank). The graph is represented as a list of edges (each edge is a tuple (weight, u, v)).

Full Implementation

python

运行

class UnionFind:
    """Union-Find (Disjoint Set Union) data structure with path compression and union by rank."""
    def __init__(self, num_vertices):
        self.parent = list(range(num_vertices))  # Parent of each vertex (initially itself)
        self.rank = [0] * num_vertices  # Rank (depth) of each set (for union by rank)
    
    def find(self, u):
        """Find the root of vertex u with path compression."""
        if self.parent[u] != u:
            self.parent[u] = self.find(self.parent[u])  # Path compression: flatten the tree
        return self.parent[u]
    
    def union(self, u, v):
        """Merge the sets containing u and v using union by rank."""
        root_u = self.find(u)
        root_v = self.find(v)
        
        if root_u == root_v:
            return False  # u and v are already in the same set (cycle detected)
        
        # Attach smaller rank tree under root of larger rank tree
        if self.rank[root_u] < self.rank[root_v]:
            self.parent[root_u] = root_v
        else:
            self.parent[root_v] = root_u
            if self.rank[root_u] == self.rank[root_v]:
                self.rank[root_u] += 1  # Increase rank if trees were equal
        return True

def kruskals_algorithm(graph, num_vertices):
    """
    Kruskal's Algorithm to find the Minimum Spanning Tree (MST) of a weighted undirected graph.
    Args:
        graph (list): List of edges, each edge is a tuple (weight, u, v) (u and v are 0-indexed vertices).
        num_vertices (int): Number of vertices in the graph.
    Returns:
        tuple: (mst_edges, mst_total_weight)
            - mst_edges: List of edges in the MST (tuples of (weight, u, v)).
            - mst_total_weight: Total weight of the MST (None if graph is disconnected).
    """
    # Step 1: Sort all edges in ascending order of weight
    sorted_edges = sorted(graph, key=lambda x: x[0])
    
    # Step 2: Initialize Union-Find structure
    uf = UnionFind(num_vertices)
    
    mst_edges = []
    mst_total_weight = 0
    
    # Step 3: Iterate through sorted edges and build MST
    for weight, u, v in sorted_edges:
        if uf.union(u, v):  # No cycle—add edge to MST
            mst_edges.append((weight, u, v))
            mst_total_weight += weight
            
            # Early exit: MST is complete when it has V-1 edges
            if len(mst_edges) == num_vertices - 1:
                break
    
    # Check if MST connects all vertices (has V-1 edges)
    if len(mst_edges) != num_vertices - 1:
        return [], None  # Graph is disconnected—no MST exists
    
    return mst_edges, mst_total_weight

# Test the Kruskal's Algorithm
if __name__ == "__main__":
    # Example: Weighted undirected graph (6 vertices, 9 edges)
    # Vertices are 0-indexed: 0, 1, 2, 3, 4, 5
    graph = [
        (2, 0, 1),
        (3, 0, 2),
        (3, 1, 2),
        (4, 1, 3),
        (1, 2, 3),
        (5, 2, 4),
        (4, 3, 4),
        (2, 3, 5),
        (6, 4, 5)
    ]
    num_vertices = 6
    
    mst_edges, mst_weight = kruskals_algorithm(graph, num_vertices)
    
    print("=== Kruskal's Algorithm Results ===")
    if mst_edges:
        print(f"Total weight of MST: {mst_weight}")
        print("Edges in MST (weight, u, v):")
        for edge in mst_edges:
            print(edge)
    else:
        print("The graph is disconnected—no MST exists.")

Sample Output

plaintext

=== Kruskal's Algorithm Results ===
Total weight of MST: 12
Edges in MST (weight, u, v):
(1, 2, 3)
(2, 0, 1)
(2, 3, 5)
(3, 0, 2)
(4, 3, 4)

Explanation of the MST

The MST connects all 6 vertices with 5 edges (V-1 = 5) and a total weight of 12. The edges are chosen to be the lightest possible without forming cycles:

  • Edge (2,3) (weight 1) → lightest edge.
  • Edge (0,1) (weight 2) → connects 0 to the component.
  • Edge (3,5) (weight 2) → connects 5 to the component.
  • Edge (0,2) (weight 3) → connects 2 to 0’s component (avoids cycle with (0,1)+(1,2)).
  • Edge (3,4) (weight 4) → connects 4 to the component.

Time and Space Complexity

MetricValueExplanation
Time ComplexityO(E log E + E α(V))– Sorting edges: O(E log E) (dominant term for most graphs).- Union-Find operations: Each find/union is nearly O(1) (α(V) is the inverse Ackermann function, which is < 5 for all practical V).- Total: O(E log E) (since E log E dominates E α(V)).
Space ComplexityO(V + E)– O(E) for storing the input graph and sorted edges.- O(V) for the Union-Find parent/rank arrays.

Key Notes:

  • Kruskal’s Algorithm is more efficient for sparse graphs (E ≈ V) because sorting O(V log V) edges is faster than Dijkstra-based algorithms like Prim’s (which is better for dense graphs).
  • For dense graphs (E ≈ V²), Prim’s Algorithm with a priority queue is faster (O(E + V log V) vs. Kruskal’s O(V² log V)).

Pros and Cons of Kruskal’s Algorithm

Pros

  1. Efficient for Sparse Graphs: Outperforms Prim’s Algorithm for graphs with few edges (e.g., network routing, where E ≈ V).
  2. Simple to Implement: Relies on sorting and Union-Find—no complex data structures like priority queues (for basic versions).
  3. Handles Large Graphs: Union-Find’s near-constant time operations make it scalable for large graphs with millions of vertices.
  4. Finds MST if It Exists: Guarantees the MST (minimum total weight) if the graph is connected (greedy choice + optimal substructure properties).

Cons

  1. Inefficient for Dense Graphs: Sorting O(V²) edges for dense graphs is slow compared to Prim’s Algorithm (O(E + V log V)).
  2. Requires Edge Sorting: The algorithm’s performance is tied to the sorting step—no way to avoid it for general graphs.
  3. Not Ideal for Dynamic Graphs: Adding/removing edges requires re-sorting and re-running the algorithm (unlike some dynamic MST algorithms).
  4. Only for Undirected Graphs: Designed for undirected graphs—does not work for directed graphs (use Arborescence algorithms instead).

Real-World Applications of Kruskal’s Algorithm

  1. Network Design: Laying fiber-optic cables, electrical grids, or water pipelines to connect cities with minimum cost.
  2. Computer Networks: Designing LAN/WAN networks to connect routers/switches with minimum cable length or bandwidth cost.
  3. Cluster Analysis: Used in hierarchical clustering (e.g., Kruskal-Wallis test) to group similar data points by minimizing the distance between clusters.
  4. Image Processing: Segmenting images by grouping pixels with similar colors/intensities (minimum spanning tree-based segmentation).
  5. Transportation Planning: Finding the cheapest way to connect multiple locations (e.g., airports, train stations) with roads/railways.
  6. Circuit Design: Connecting components on a circuit board with minimum wire length to reduce manufacturing costs and signal delay.

Summary

Best use cases: Sparse graphs, network design, cluster analysis, and large-scale systems.

Kruskal’s Algorithm is a greedy algorithm that finds the MST of a connected weighted undirected graph.

Core steps: Sort edges by weight → use Union-Find to add edges without cycles → stop when MST has V-1 edges.

Time complexity: O(E log E) (dominated by edge sorting); Space complexity: O(V + E).



了解 Ruigu Electronic 的更多信息

订阅后即可通过电子邮件收到最新文章。

Posted in

Leave a comment