Heap Sort: Features, Pros, and Implementation Guide

You want to learn about Heap Sort—a comparison-based sorting algorithm that leverages the heap data structure (a complete binary tree) to efficiently sort elements. Its core idea is to first convert the unsorted array into a max-heap (or min-heap), then repeatedly extract the root element (the largest or smallest value) and rebuild the heap until the entire array is sorted. Heap Sort is known for its O(n log n) time complexity (stable across all cases) and O(1) auxiliary space (for in-place sorting), making it a memory-efficient alternative to Merge Sort.


Core Concepts: Heap Data Structure

Before diving into Heap Sort, it’s critical to understand heaps:

  • heap is a complete binary tree (all levels filled except possibly the last, which is filled left to right) that satisfies the heap property:
    • Max-Heap: For every parent node, its value is greater than or equal to the values of its child nodes (root = maximum element).
    • Min-Heap: For every parent node, its value is less than or equal to the values of its child nodes (root = minimum element).
  • Heaps are typically represented as arrays (no need for explicit tree nodes):
    • For a node at index i:
      • Left child index: 2*i + 1
      • Right child index: 2*i + 2
      • Parent node index: (i - 1) // 2

Example Max-Heap (Array Representation)

Tree structure:

plaintext

       9
     /   \
    7     6
   / \   /
  3   2 5

Array representation: [9, 7, 6, 3, 2, 5]


Core Principles of Heap Sort

Heap Sort follows two key steps (using a max-heap for ascending sort):

1. Build a Max-Heap from the Unsorted Array

Convert the entire unsorted array into a max-heap. This process (called “heapify-up” or “build-heap”) ensures the root element is the maximum value in the array.

  • How it works: Start from the last non-leaf node (index (n//2 - 1), where n is the array length) and move upward, heapifying each node to maintain the max-heap property.

2. Extract Elements from the Heap One by One

  • Swap the root (max element) with the last element of the heap (the end of the unsorted portion of the array). This moves the max element to its correct sorted position.
  • Reduce the heap size by 1 (exclude the last element, which is now sorted).
  • Heapify the new root (which may violate the max-heap property) to restore the heap structure.
  • Repeat until the heap size is 1 (the entire array is sorted).

Example Walkthrough (Sorting [4, 10, 3, 5, 1] Ascending)

Step 1: Build a Max-Heap

Original array: [4, 10, 3, 5, 1]

  • Last non-leaf node index: (5//2 - 1) = 1 (element 10).
  • Heapify nodes from index 1 downward:
    • Heapify index 1 (10): Already a max-heap (children 5 and 1 are smaller).
    • Heapify index 0 (4): Swap with larger child (10), resulting in [10, 5, 3, 4, 1] (valid max-heap).

Step 2: Extract Elements

  1. Swap root (10) with last element (1): [1, 5, 3, 4, 10] (sorted portion: [10]). Heapify root (1):
    • Swap 1 with larger child (5): [5, 4, 3, 1, 10] (valid max-heap).
  2. Swap root (5) with last element (1): [1, 4, 3, 5, 10] (sorted portion: [5, 10]). Heapify root (1):
    • Swap 1 with larger child (4): [4, 1, 3, 1, 10] (valid max-heap).
  3. Swap root (4) with last element (1): [1, 1, 3, 4, 10] (sorted portion: [4, 5, 10]). Heapify root (1):
    • Swap 1 with larger child (3): [3, 1, 1, 1, 10] (valid max-heap).
  4. Swap root (3) with last element (1): [1, 1, 3, 4, 10] (sorted portion: [3, 4, 5, 10]).
  5. Heap size is 1—sort complete: [1, 3, 4, 5, 10].

Heap Sort Implementation (Python)

1. Standard In-Place Implementation (Max-Heap for Ascending Sort)

python

运行

def heap_sort(arr):
    n = len(arr)
    
    # Step 1: Build a max-heap from the array
    def build_max_heap(arr, n):
        # Start from the last non-leaf node and heapify downward
        start_idx = (n // 2) - 1
        for i in range(start_idx, -1, -1):
            heapify(arr, n, i)
    
    # Heapify a subtree rooted at index i (maintains max-heap property)
    def heapify(arr, n, i):
        largest = i  # Initialize largest as root
        left = 2 * i + 1  # Left child index
        right = 2 * i + 2  # Right child index
        
        # If left child exists and is larger than root
        if left < n and arr[left] > arr[largest]:
            largest = left
        
        # If right child exists and is larger than largest so far
        if right < n and arr[right] > arr[largest]:
            largest = right
        
        # If largest is not root (violation of max-heap property)
        if largest != i:
            arr[i], arr[largest] = arr[largest], arr[i]  # Swap
            # Recursively heapify the affected subtree
            heapify(arr, n, largest)
    
    # Build max-heap
    build_max_heap(arr, n)
    
    # Step 2: Extract elements one by one
    for i in range(n-1, 0, -1):
        arr[i], arr[0] = arr[0], arr[i]  # Swap root (max) with last element
        heapify(arr, i, 0)  # Heapify reduced heap (size i)
    
    return arr

# Test code
if __name__ == "__main__":
    unsorted_arr = [4, 10, 3, 5, 1]
    sorted_arr = heap_sort(unsorted_arr.copy())  # Use copy to preserve original
    print("Original array:", unsorted_arr)
    print("Sorted array:", sorted_arr)  # Output: [1, 3, 4, 5, 10]

2. Min-Heap Implementation (for Descending Sort)

To sort in descending order, use a min-heap and adjust the logic:

python

运行

def heap_sort_descending(arr):
    n = len(arr)
    
    def build_min_heap(arr, n):
        start_idx = (n // 2) - 1
        for i in range(start_idx, -1, -1):
            min_heapify(arr, n, i)
    
    def min_heapify(arr, n, i):
        smallest = i
        left = 2 * i + 1
        right = 2 * i + 2
        
        if left < n and arr[left] < arr[smallest]:
            smallest = left
        if right < n and arr[right] < arr[smallest]:
            smallest = right
        
        if smallest != i:
            arr[i], arr[smallest] = arr[smallest], arr[i]
            min_heapify(arr, n, smallest)
    
    build_min_heap(arr, n)
    
    for i in range(n-1, 0, -1):
        arr[i], arr[0] = arr[0], arr[i]  # Swap root (min) with last element
        min_heapify(arr, i, 0)
    
    return arr

# Test
unsorted_arr = [4, 10, 3, 5, 1]
sorted_desc = heap_sort_descending(unsorted_arr.copy())
print("Descending sorted array:", sorted_desc)  # Output: [10, 5, 4, 3, 1]


Time and Space Complexity

MetricValueExplanation
Time Complexity (Best)O(n log n)Building the heap takes O(n); extracting elements takes O(n log n)
Time Complexity (Worst)O(n log n)Same as best case—no degradation (unlike Quick Sort)
Time Complexity (Average)O(n log n)Consistent logarithmic time due to heapify operations
Space ComplexityO(1) (in-place)Only constant auxiliary space is used; recursive heapify uses O(log n) stack space (can be optimized to iterative for O(1))
StabilityUnstableSwapping elements during heapify can change the relative order of equal elements

Key Note on Heapify Time:

  • Building a heap takes O(n) time (not O(n log n)) because most nodes are leaf nodes (no need to heapify) and deeper nodes require fewer heapify steps.
  • Extracting each element takes O(log n) time (heapify adjusts the heap from root to leaf), and there are n-1 extractions—total O(n log n) time.

Pros and Cons of Heap Sort

Pros

  1. Stable Time Complexity: Always O(n log n) with no worst-case degradation (unlike Quick Sort’s O(n²) in extreme cases).
  2. In-Place Sorting: Uses O(1) auxiliary space (excluding recursive stack), making it memory-efficient for large datasets.
  3. No Additional Data Structures: Implemented directly on arrays—no need for linked lists or other structures.
  4. Useful for Top-K Problems: Heap Sort can be modified to find the top K largest/smallest elements in O(n + K log n) time (more efficient than full sorting for K << n).

Cons

  1. Unstable Sorting: Equal elements may change their relative order (e.g., [2a, 2b, 1] could become [1, 2b, 2a] after sorting).
  2. Higher Constant Overhead: Slower than Quick Sort in practice for most datasets due to the cost of heapify operations (comparisons and swaps).
  3. Poor Cache Performance: Heapify accesses non-consecutive array elements, leading to more cache misses compared to Merge Sort or Quick Sort (which access contiguous elements).

Real-World Applications of Heap Sort

  1. Top-K Problems: Finding the K largest/smallest elements (e.g., top 10 highest scores, K most frequent words) without full sorting.
  2. Priority Queues: Heap Sort is the basis for priority queue implementations (e.g., Python’s heapq module, Java’s PriorityQueue).
  3. Operating Systems: Used in scheduling algorithms (e.g., priority-based task scheduling) to select the highest-priority task efficiently.
  4. External Sorting: Suitable for sorting data larger than memory (similar to Merge Sort) due to its in-place nature and O(n log n) time.
  5. Database Indexing: Used in heap files (a type of database storage) to maintain sorted order for fast lookups.

Summary

Ideal use cases: Memory-constrained environments, top-K problems, and systems requiring consistent performance (no worst-case degradation).

Heap Sort leverages the max-heap/min-heap property and follows two core steps: build a heap → extract elements iteratively.

Key features: O(n log n) stable time complexity + O(1) auxiliary space, but unstable and slower in practice than Quick Sort for most cases.



了解 Ruigu Electronic 的更多信息

订阅后即可通过电子邮件收到最新文章。

Posted in

Leave a comment