Radix Sort Explained: Step-by-Step Guide and Implementation

Radix Sort

You want to learn about Radix Sort—an efficient non-comparison-based sorting algorithm that sorts data by processing individual digits or characters, right? Unlike comparison-based algorithms (Bubble Sort, Quick Sort), Radix Sort leverages the “radix” (base) of the data (e.g., base 10 for numbers, base 256 for ASCII characters) to group and order elements. It’s especially powerful for sorting large datasets of fixed-length values (like integers, phone numbers, or strings) and often outperforms O(n log n) algorithms in practice for such cases.

I. Core Principles of Radix Sort

Radix Sort follows a “digit-by-digit” sorting approach, using a stable sub-sorting algorithm (typically Counting Sort or Bucket Sort) to process each digit from the least significant digit (LSD) to the most significant digit (MSD) (or vice versa). Here’s the step-by-step workflow:

  1. Determine the Radix: Choose the base of the data (e.g., base 10 for decimal numbers, base 2 for binary, base 26 for letters).
  2. Find the Maximum Number of Digits: Identify the element with the most digits (e.g., for [123, 45, 6], the max digits = 3).
  3. Sort by Each Digit: For each digit position (starting from LSD to MSD):
    • Use a stable sub-sorting algorithm to sort the array based on the current digit.
    • Pad shorter numbers with leading zeros (conceptually) to match the max digit length.
  4. Final Sorted Array: After processing all digit positions, the array is fully sorted.

Key Insight

By using a stable sub-sorting algorithm, Radix Sort preserves the order of elements with the same digit at the current position. This ensures that previous sorting steps (for less significant digits) are not undone when processing more significant digits.

II. Complete Implementation Code (Python)

Below is a practical Radix Sort implementation for decimal integers (base 10) using Counting Sort as the stable sub-sorter. It includes edge case handling and detailed comments:

python

运行

def counting_sort_for_radix(arr, exp):
    """
    Helper function: Stable Counting Sort based on the digit at position 'exp' (10^exp)
    :param arr: Input array to sort
    :param exp: Exponent (10^exp = digit position: 1 for units, 10 for tens, 100 for hundreds, etc.)
    :return: None (modifies the array in-place with stable sorting)
    """
    n = len(arr)
    output = [0] * n  # Temporary output array to store sorted elements
    count = [0] * 10  # Count array for digits 0-9 (base 10)

    # Step 1: Count frequency of each digit at the current 'exp' position
    for num in arr:
        digit = (num // exp) % 10  # Extract the digit at the current position
        count[digit] += 1

    # Step 2: Modify count array to store cumulative frequencies (for stable sorting)
    # This ensures elements with the same digit retain their relative order
    for i in range(1, 10):
        count[i] += count[i - 1]

    # Step 3: Build the output array (traverse from right to left to maintain stability)
    i = n - 1
    while i >= 0:
        digit = (arr[i] // exp) % 10
        output[count[digit] - 1] = arr[i]  # Place element in its correct position
        count[digit] -= 1  # Decrement count for next element with the same digit
        i -= 1

    # Step 4: Copy the sorted output back to the original array
    for i in range(n):
        arr[i] = output[i]

def radix_sort(arr):
    """
    Radix Sort (LSD: Least Significant Digit first) for non-negative integers
    :param arr: List of non-negative integers to sort
    :return: None (modifies the original array in-place)
    """
    # Edge case: Empty array or single-element array
    if len(arr) <= 1:
        return

    # Step 1: Find the maximum number to determine the number of digits
    max_num = max(arr)

    # Step 2: Process each digit position (exp = 10^i, where i is the digit index)
    exp = 1
    while max_num // exp > 0:
        counting_sort_for_radix(arr, exp)
        exp *= 10  # Move to the next digit (units → tens → hundreds → ...)

# Extension: Radix Sort for negative integers (handles both positive and negative)
def radix_sort_with_negatives(arr):
    if len(arr) <= 1:
        return

    # Split array into negative and non-negative parts
    negatives = [-num for num in arr if num < 0]
    non_negatives = [num for num in arr if num >= 0]

    # Sort non-negative part (using standard Radix Sort)
    radix_sort(non_negatives)

    # Sort negative part (sort absolute values, then reverse to get descending order)
    radix_sort(negatives)
    negatives = [-num for num in reversed(negatives)]  # Convert back to negative and reverse

    # Merge sorted negatives and non-negatives
    arr[:] = negatives + non_negatives

# Test cases
if __name__ == "__main__":
    # Test 1: Non-negative integers
    test1 = [170, 45, 75, 90, 802, 24, 2, 66]
    print("Non-negative sort - Before:", test1)
    radix_sort(test1)
    print("Non-negative sort - After:", test1)  # Output: [2, 24, 45, 66, 75, 90, 170, 802]

    # Test 2: With negative integers
    test2 = [170, -45, 75, -90, 802, -24, 2, -66]
    print("\nWith negatives - Before:", test2)
    radix_sort_with_negatives(test2)
    print("With negatives - After:", test2)  # Output: [-90, -66, -45, -24, 2, 24, 75, 170, 802]

    # Test 3: Edge cases (empty array, single element, duplicates)
    test3 = []
    test4 = [42]
    test5 = [5, 3, 8, 3, 2, 7, 3, 9]
    radix_sort(test3)
    radix_sort(test4)
    radix_sort(test5)
    print("\nEmpty array sorted:", test3)  # Output: []
    print("Single element sorted:", test4)  # Output: [42]
    print("Array with duplicates sorted:", test5)  # Output: [2, 3, 3, 3, 5, 7, 8, 9]

    # Test 4: Large numbers (fixed-length digits)
    test6 = [1234, 5678, 9012, 3456, 7890, 2345]
    radix_sort(test6)
    print("Large fixed-length numbers sorted:", test6)  # Output: [1234, 2345, 3456, 5678, 7890, 9012]

III. Key Code Explanations

1. Helper Function: counting_sort_for_radix

  • Purpose: A stable sorting algorithm tailored to sort elements based on a specific digit position (defined by exp).
  • Digit Extraction(num // exp) % 10 extracts the digit at the current position (e.g., exp=10 extracts the tens digit: 456 // 10 % 10 = 5).
  • Stability: By traversing the array from right to left when building the output, we preserve the relative order of elements with the same digit—critical for Radix Sort’s correctness.

2. Main Function: radix_sort (Non-Negative Integers)

  • Max Digit Calculation: Finding max_num determines how many digit positions we need to process (e.g., max_num=802 requires processing units, tens, and hundreds digits).
  • Digit-by-Digit Sorting: The loop iterates over each digit position (exp=1, 10, 100, ...) and applies the stable Counting Sort for each position.

3. Extension: radix_sort_with_negatives

  • Handling Negatives: Splits the array into negative and non-negative parts, sorts the non-negative part normally, and sorts the negatives by their absolute values (then reverses and converts back to negatives to maintain correct order).

IV. Algorithm Performance Analysis

MetricResultExplanation
Time ComplexityO(d * (n + k))– d: Number of digits in the maximum element (e.g., 3 for 100–999).- n: Number of elements.- k: Radix (base, e.g., 10 for decimal).For fixed d and k, this is O(n) (linear time).
Space ComplexityO(n + k)Uses O(n) space for the output array and O(k) space for the count array.
StabilityStableRelies on the stable sub-sorter (Counting Sort) to preserve order of equal elements.
RestrictionsWorks best for fixed-length data (integers, phone numbers, strings). Requires data to have a defined “radix” (e.g., base 10, base 26).

When Radix Sort Outperforms Comparison-Based Algorithms

  • Large Datasets with Fixed-Length Values: For example, sorting 1 million phone numbers (10 digits, base 10) → d=10k=10, time complexity = O(10*(1e6 + 10)) = O(1e7) operations—faster than Quick Sort’s O(1e6 log 1e6) ≈ O(20e6) operations.
  • Data with Small Radix: Base 2 (binary) or base 10 data is processed efficiently, as k is small.

V. Radix Sort vs. Quick Sort vs. Merge Sort

FeatureRadix SortQuick SortMerge Sort
Sorting ApproachNon-comparison (digit-based)Comparison (divide-and-conquer)Comparison (divide-and-conquer)
Time ComplexityO(d*(n+k)) (linear for fixed d/k)O(n log n) (average)O(n log n) (guaranteed)
Space ComplexityO(n + k)O(log n) (in-place)O(n)
StabilityStableUnstableStable
Best ForFixed-length data (integers, strings)General-purpose sortingStable sorting, linked lists
RestrictionsRequires defined radix/fixed lengthNo restrictionsNo restrictions

Summary

Tradeoff: Uses more auxiliary space (O(n + k)) than in-place Quick Sort, and is less flexible for arbitrary data types.

Radix Sort is a non-comparison-based algorithm that sorts data by processing digits/characters from LSD to MSD (or vice versa), using a stable sub-sorter (e.g., Counting Sort).

Key advantage: Linear time complexity (O(n)) for fixed-length data, outperforming O(n log n) comparison-based algorithms for large datasets.

Key requirements: Data must have a defined radix (base) and fixed length (or manageable digit count).

Practical use cases: Sorting integers, phone numbers, postal codes, strings, or any data with uniform structure.



了解 Ruigu Electronic 的更多信息

订阅后即可通过电子邮件收到最新文章。

Posted in

Leave a comment