Pigeonhole sort

Pigeonhole sorting is a sorting algorithm that is suitable for sorting lists of elements where the number of elements (n) and the length of the range of possible key values (N) are approximately the same.[1] It requires O(n + N) time. It is similar to counting sort, but differs in that it "moves items twice: once to the bucket array and again to the final destination [whereas] counting sort builds an auxiliary array then uses the array to compute each item's final destination and move the item there."[2]

Pigeonhole sort
ClassSorting algorithm
Data structureArray
Worst-case performance, where N is the range of key values and n is the input size
Worst-case space complexity

The pigeonhole algorithm works as follows:

  1. Given an array of values to be sorted, set up an auxiliary array of initially empty "pigeonholes", one pigeonhole for each key in the range of the keys in the original array.
  2. Going over the original array, put each value into the pigeonhole corresponding to its key, such that each pigeonhole eventually contains a list of all values with that key.
  3. Iterate over the pigeonhole array in increasing order of keys, and for each pigeonhole, put its elements into the original array in increasing order.

Example

Suppose one were sorting these value pairs by their first element:

  • (5, "hello")
  • (3, "pie")
  • (8, "apple")
  • (5, "king")

For each value between 3 and 8 we set up a pigeonhole, then move each element to its pigeonhole:

  • 3: (3, "pie")
  • 4:
  • 5: (5, "hello"), (5, "king")
  • 6:
  • 7:
  • 8: (8, "apple")

The pigeonhole array is then iterated over in order, and the elements are moved back to the original list.

The difference between pigeonhole sort and counting sort is that in counting sort, the auxiliary array does not contain lists of input elements, only counts:

  • 3: 1
  • 4: 0
  • 5: 2
  • 6: 0
  • 7: 0
  • 8: 1

For arrays where N is much larger than n, bucket sort is a generalization that is more efficient in space and time.

Python implementation

def pigeonhole_sort(lst):
  """
  Sort list of (key, value) pairs by key.
  """
  base = min(key for key, value in lst)
  size = max(key for key, value in lst) - base + 1

  pigeonholes = [[] for _ in range(size)]

  for key, value in lst:
    i = key - base
    pigeonhole = pigeonholes[i]
    pigeonhole.append((key, value))

  i = 0
  for pigeonhole in pigeonholes:
    for element in pigeonhole:
      lst[i] = element
      i += 1
gollark: Inspired by CRUX, another minimalist distribution, Judd Vinet started the Arch Linux project in March 2002. The name was chosen because Vinet liked the word's meaning of "the principal," as in "arch-enemy".
gollark: Arch Linux has comprehensive documentation, which consists of a community wiki known as the ArchWiki.
gollark: Pacman, a package manager written specifically for Arch Linux, is used to install, remove and update software packages. Arch Linux uses a rolling release model, meaning there are no "major releases" of completely new versions of the system; a regular system update is all that is needed to obtain the latest Arch software; the installation images released every month by the Arch team are simply up-to-date snapshots of the main system components.
gollark: Arch Linux is a Linux distribution created for computers with x86-64 processors. Arch Linux adheres to the KISS principle ("Keep It Simple, Stupid"). The project attempts to have minimal distribution-specific changes, and therefore minimal breakage with updates, and be pragmatic over ideological design choices and focus on customizability rather than user-friendliness.
gollark: By the way, I use Arch.

See also

References

This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.