Counting Sort Explained: Pros, Cons, and Use Cases

You want to learn about Counting Sort—a non-comparison-based sorting algorithm that excels at sorting integers (or values with a known, limited range) by counting the frequency of each unique element. Its core idea is to: 1) count how many times each element appears in the input, 2) use these counts to determine the position of each element in the sorted output, and 3) construct the sorted array.

Unlike comparison-based algorithms (e.g., Quick Sort, Merge Sort), Counting Sort achieves O(n + k) time complexity (where n is the number of elements and k is the range of input values), making it far faster for datasets with a small, known value range. However, it requires additional space proportional to k, so it’s not ideal for data with a very large value range.


Core Principles of Counting Sort

Counting Sort works best when the input values are non-negative integers (easily adaptable to negative numbers) and the range k = max(input) - min(input) + 1 is small relative to n. Here’s the step-by-step process (for non-negative integers):

1. Find the Range of Input Values

Calculate the minimum (min_val) and maximum (max_val) values in the input array to determine the size of the count array:

k = max_val - min_val + 1

2. Count Frequencies of Each Element

Create a count array of size k initialized to 0. Iterate through the input array and increment the count of each element (adjusting for min_val to map values to valid array indices).

3. Compute Prefix Sums (Optional but Critical for Stability)

To maintain stability (preserve the relative order of equal elements), convert the count array into a prefix sum array. Each entry count[i] now represents the number of elements ≤ the value corresponding to i—this tells us the position of the last occurrence of that value in the sorted array.

4. Build the Sorted Output Array

Iterate over the input array from right to left (to maintain stability), use the prefix sum array to find the correct position of each element in the output array, and decrement the corresponding prefix sum to handle duplicates.

Example Walkthrough (Sorting [4, 2, 2, 8, 3, 3, 1])

Step 1: Find Range

min_val = 1max_val = 8 → k = 8 - 1 + 1 = 8

Step 2: Count Frequencies

count array (index 0 → 1, index 1 → 2, …, index 7 → 8):

count = [1, 2, 2, 0, 1, 0, 0, 1]

(Explanation: 1 appears 1x, 2 appears 2x, 3 appears 2x, 4 appears 1x, 8 appears 1x)

Step 3: Compute Prefix Sums

count = [1, 3, 5, 5, 6, 6, 6, 7]

(Explanation: 1 element ≤1, 3 elements ≤2, 5 elements ≤3, etc.)

Step 4: Build Sorted Array

Iterate input from right to left ([1, 3, 3, 8, 2, 2, 4]):

  • 1 → count[0] = 1 → position 0 → sorted[0] = 1 → count[0] = 0
  • 3 → count[2] = 5 → position 4 → sorted[4] = 3 → count[2] = 4
  • 3 → count[2] = 4 → position 3 → sorted[3] = 3 → count[2] = 3
  • 8 → count[7] = 7 → position 6 → sorted[6] = 8 → count[7] = 6
  • 2 → count[1] = 3 → position 2 → sorted[2] = 2 → count[1] = 2
  • 2 → count[1] = 2 → position 1 → sorted[1] = 2 → count[1] = 1
  • 4 → count[3] = 5 → position 5 → sorted[5] = 4 → count[3] = 4

Final sorted array: [1, 2, 2, 3, 3, 4, 8]


Counting Sort Implementations (Python)

1. Basic Implementation (Non-Stable, Non-Negative Integers)

Simple version for non-negative integers—does not preserve the order of equal elements:

python

运行

def counting_sort_basic(arr):
    if not arr:
        return arr
    
    # Step 1: Find min and max values to determine count array size
    min_val = min(arr)
    max_val = max(arr)
    k = max_val - min_val + 1
    
    # Step 2: Count frequencies of each element
    count = [0] * k
    for num in arr:
        count[num - min_val] += 1  # Map num to count array index
    
    # Step 3: Build sorted array from count array
    sorted_arr = []
    for i in range(k):
        sorted_arr.extend([i + min_val] * count[i])  # Repeat element count[i] times
    
    return sorted_arr

# Test code
if __name__ == "__main__":
    unsorted_arr = [4, 2, 2, 8, 3, 3, 1]
    sorted_arr = counting_sort_basic(unsorted_arr)
    print("Original array:", unsorted_arr)
    print("Basic sorted array:", sorted_arr)  # Output: [1, 2, 2, 3, 3, 4, 8]

2. Stable Implementation (Preserves Order of Equal Elements)

Uses prefix sums to maintain stability—critical for use cases like sorting objects by a key:

python

运行

def counting_sort_stable(arr):
    if not arr:
        return arr
    
    # Step 1: Find min and max values
    min_val = min(arr)
    max_val = max(arr)
    k = max_val - min_val + 1
    n = len(arr)
    
    # Step 2: Count frequencies
    count = [0] * k
    for num in arr:
        count[num - min_val] += 1
    
    # Step 3: Compute prefix sums (to get positions)
    for i in range(1, k):
        count[i] += count[i - 1]  # count[i] = number of elements ≤ (i + min_val)
    
    # Step 4: Build sorted array (right to left for stability)
    sorted_arr = [0] * n
    for num in reversed(arr):
        # Calculate position: count[num - min_val] - 1 (0-based index)
        pos = count[num - min_val] - 1
        sorted_arr[pos] = num
        count[num - min_val] -= 1  # Decrement for next occurrence of the same number
    
    return sorted_arr

# Test
unsorted_arr = [4, 2, 2, 8, 3, 3, 1]
stable_sorted = counting_sort_stable(unsorted_arr)
print("Stable sorted array:", stable_sorted)  # Output: [1, 2, 2, 3, 3, 4, 8]

3. Handling Negative Integers

Easily adapted to negative numbers by adjusting the mapping to the count array:

python

运行

def counting_sort_negative(arr):
    if not arr:
        return arr
    
    # Step 1: Find min and max (handles negatives)
    min_val = min(arr)
    max_val = max(arr)
    k = max_val - min_val + 1
    
    # Step 2: Count frequencies (offset negative numbers to non-negative indices)
    count = [0] * k
    for num in arr:
        count[num - min_val] += 1  # e.g., -3 → index 0 if min_val = -3
    
    # Step 3: Build sorted array
    sorted_arr = []
    for i in range(k):
        sorted_arr.extend([i + min_val] * count[i])
    
    return sorted_arr

# Test with negative numbers
unsorted_neg = [-5, 2, -2, 0, -5, 1]
sorted_neg = counting_sort_negative(unsorted_neg)
print("Sorted negative array:", sorted_neg)  # Output: [-5, -5, -2, 0, 1, 2]

4. Sorting Objects by a Numeric Key (Stable)

Counting Sort is often used to sort objects by a numeric key (e.g., student IDs, ages)—stability ensures objects with the same key retain their original order:

python

运行

def counting_sort_by_key(arr, key_func):
    """
    Sort objects by a numeric key (e.g., key_func = lambda x: x.age)
    """
    if not arr:
        return arr
    
    # Step 1: Extract keys and find their range
    keys = [key_func(obj) for obj in arr]
    min_key = min(keys)
    max_key = max(keys)
    k = max_key - min_key + 1
    n = len(arr)
    
    # Step 2: Count key frequencies
    count = [0] * k
    for key in keys:
        count[key - min_key] += 1
    
    # Step 3: Prefix sums for positions
    for i in range(1, k):
        count[i] += count[i - 1]
    
    # Step 4: Build sorted array (stable)
    sorted_arr = [None] * n
    for obj in reversed(arr):
        key = key_func(obj)
        pos = count[key - min_key] - 1
        sorted_arr[pos] = obj
        count[key - min_key] -= 1
    
    return sorted_arr

# Example: Sort students by age (stable)
class Student:
    def __init__(self, name, age):
        self.name = name
        self.age = age
    
    def __repr__(self):
        return f"({self.name}, {self.age})"

students = [Student("Alice", 22), Student("Bob", 19), Student("Charlie", 22), Student("Diana", 19)]
sorted_students = counting_sort_by_key(students, key_func=lambda x: x.age)
print("Sorted by age (stable):", sorted_students)
# Output: [(Bob, 19), (Diana, 19), (Alice, 22), (Charlie, 22)] (preserves order of same ages)


Time and Space Complexity

MetricValueExplanation
Time ComplexityO(n + k)O(n) for counting frequencies + O(k) for building the sorted array; k = range of values
Space ComplexityO(n + k)O(k) for the count array + O(n) for the output array (stable version); basic version uses O(k)
StabilityStable (if implemented correctly)Preserves the relative order of equal elements (critical for object sorting)
AdaptabilityWorks for integers (positive/negative) and objects with numeric keysNot suitable for non-numeric data (e.g., strings) without key mapping

Key Notes:

  • When is Counting Sort Fast? When k is small relative to n (e.g., sorting exam scores 0–100 for 10,000 students: k=101n=10,000 → O(10,101) time).
  • When is Counting Sort Slow? When k is much larger than n (e.g., sorting 100 random integers between 0 and 1,000,000 → k=1,000,001 → O(1,000,100) time, which is worse than O(n log n) algorithms).

Pros and Cons of Counting Sort

Pros

  1. Blazing Fast for Small Value Ranges: O(n + k) time complexity outperforms all comparison-based algorithms (O(n log n)) when k is small.
  2. Stable Sorting: Preserves the order of equal elements—essential for sorting objects by a key (e.g., secondary sorting).
  3. Simple to Implement: No complex recursion or divide-and-conquer logic; straightforward for integer data.
  4. No Comparison Overhead: Avoids the O(log n) comparison cost of algorithms like Quick Sort or Merge Sort.

Cons

  1. Requires Known Value Range: Needs prior knowledge of min_val and max_val (or must compute them, adding O(n) time).
  2. Poor for Large Value Ranges: High space complexity (O(k)) when k is large (e.g., sorting 64-bit integers is impractical).
  3. Limited to Numeric Data: Works best for integers; requires mapping non-numeric data to numeric keys (e.g., strings to ASCII values).
  4. Not In-Place: The stable version uses O(n) extra space for the output array (unlike Heap Sort or Selection Sort).

Real-World Applications of Counting Sort

  1. Sorting Small-Range Integers: Exam scores (0–100), ages (0–120), or ratings (1–5).
  2. Radix Sort Subroutine: Counting Sort is the most common subroutine for Radix Sort (a non-comparison-based algorithm for large integers).
  3. Database Sorting: Sorting records by numeric keys (e.g., user IDs, timestamps) where stability is required.
  4. Image Processing: Sorting pixel values (0–255 for 8-bit images) for tasks like histogram equalization.
  5. Counting Frequencies: Beyond sorting, Counting Sort is often used to count element frequencies (e.g., survey responses, log data).

Summary

Best use cases: Small-range integers, sorting objects by numeric keys, and as a subroutine for Radix Sort.

Counting Sort is a non-comparison-based algorithm that sorts by counting element frequencies and using prefix sums to determine positions.

Key features: O(n + k) time complexitystable (when implemented correctly), but requires O(n + k) space and works best for small value ranges.



了解 Ruigu Electronic 的更多信息

订阅后即可通过电子邮件收到最新文章。

Posted in

Leave a comment