All nearest smaller values

In computer science, the all nearest smaller values problem is the following task: for each position in a sequence of numbers, search among the previous positions for the last position that contains a smaller value. This problem can be solved efficiently both by parallel and non-parallel algorithms: Berkman, Schieber & Vishkin (1993), who first identified the procedure as a useful subroutine for other parallel programs, developed efficient algorithms to solve it in the Parallel Random Access Machine model; it may also be solved in linear time on a non-parallel computer using a stack-based algorithm. Later researchers have studied algorithms to solve it in other models of parallel computation.

Example

Suppose that the input is the binary van der Corput sequence

0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15.

The first element of the sequence (0) has no previous value. The nearest (only) smaller value previous to 8 and to 4 is 0. All three values previous to 12 are smaller, but the nearest one is 4. Continuing in the same way, the nearest previous smaller values for this sequence (indicating the nonexistence of a previous smaller value by a dash) are

—, 0, 0, 4, 0, 2, 2, 6, 0, 1, 1, 5, 1, 3, 3, 7.

In most applications, it is the positions of the nearest smaller values, rather than the values themselves, that should be computed, and in many applications the same computation should also be performed on the reversal of the sequence, in order to find for each value the closest following smaller value.

Applications

Berkman, Schieber & Vishkin (1993) mention many other problems that may be solved efficiently in parallel using a nearest smaller values computation. Among them, they include the following:

  • Merge algorithms, computing the merge step of a merge sort. The input to these algorithms consists of two sorted arrays of numbers; the desired output is the same set of numbers in a single sorted array. If one concatenates the two sorted arrays, the first in ascending order and the second in descending order, then the predecessor of each value in the output is either its closest previous smaller value or its closest following smaller value (whichever of the two is larger), and the position of each value in the sorted output array may easily be calculated from the positions of these two nearest smaller values.
  • Construction of Cartesian trees. A Cartesian tree is a data structure introduced by Vuillemin (1980) and further studied by Gabow, Bentley & Tarjan (1984) for range searching applications. Cartesian trees also arise in the definition of the treap and randomized binary search tree data structures for binary searching. The Cartesian tree of a sequence of values has a node for each value; the root of the tree is the node for the minimum value of the sequence, and the parent of every other node is either its closest previous smaller value or its closest following smaller value (whichever of the two exists and is larger). Thus, Cartesian trees may be constructed in linear time based on an all nearest smaller values algorithm, as illustrated by the sketch after this list.
  • Matching parentheses. If a sequence of open and close parenthesis characters is given as input, together with the nesting depth of each parenthesis, then the match to each open parenthesis is the next close parenthesis with no larger nesting depth, so it can be found by an all nearest smaller values computation that breaks ties in favor of close parentheses. If the nesting depths are not given, they can be calculated using a prefix sum computation.
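As a concrete illustration of the Cartesian tree construction, the following Python sketch (the function names are illustrative, and the input values are assumed to be distinct) derives each node's parent from the two nearest-smaller-value computations, using the stack-based routine described under the sequential algorithm below:

def nearest_smaller_to_left(a):
    """Index of the closest previous smaller value for each position,
    or None when no previous value is smaller (stack-based, linear time)."""
    out, stack = [], []
    for i, x in enumerate(a):
        while stack and a[stack[-1]] >= x:
            stack.pop()
        out.append(stack[-1] if stack else None)
        stack.append(i)
    return out

def cartesian_tree_parents(a):
    """Parent index of each node of the (min-rooted) Cartesian tree of a,
    with None for the root.  Assumes the values of a are distinct."""
    n = len(a)
    left = nearest_smaller_to_left(a)
    # Closest following smaller values, via the same routine on the reversal.
    right = [None if j is None else n - 1 - j
             for j in reversed(nearest_smaller_to_left(a[::-1]))]
    parents = []
    for i in range(n):
        l, r = left[i], right[i]
        if l is None and r is None:
            parents.append(None)   # overall minimum: the root
        elif r is None or (l is not None and a[l] > a[r]):
            parents.append(l)      # the previous smaller value is the larger
        else:
            parents.append(r)      # the following smaller value is the larger
    return parents

For the sequence 2, 1, 3 this yields the parent list [1, None, 1]: the root is the node for the minimum value 1, and the nodes for the other two values are its children.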

Similar techniques may also be applied to problems of polygon triangulation, convex hull construction (parallelizing the sequential Graham scan convex hull algorithm), reconstruction of trees from two of the trees' traversal orderings, and quadtree construction.[1]

Sequential algorithm

On a sequential computer, it is straightforward to compute all nearest smaller values using a stack data structure: one processes the values in sequence order, using the stack to maintain a subsequence of the values that have been processed so far and are smaller than any later value that has already been processed. In pseudocode, the algorithm is as follows.

S = new empty stack data structure
for x in the input sequence do
    while S is nonempty and the top element of S is greater than or equal to x do
        pop S
    if S is empty then
        x has no preceding smaller value
    else
        the nearest smaller value to x is the top element of S
    push x onto S

Despite having a nested loop structure, the running time of this algorithm is linear, because every iteration of the inner loop removes an item that had been added in some previous iteration of the outer loop. It is closely related to an algorithm of Knuth for sorting with a stack (for inputs that can be sorted in this way).[2]
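The pseudocode translates directly into the following Python sketch (the function name is illustrative), which reports positions rather than values and reproduces the worked example above:

def all_nearest_smaller_values(sequence):
    """Return, for each position, the index of the closest previous
    smaller value, or None when no previous value is smaller.
    Linear time: each index is pushed and popped at most once."""
    prev_smaller = []
    stack = []  # indices whose values form a strictly increasing subsequence
    for i, x in enumerate(sequence):
        while stack and sequence[stack[-1]] >= x:
            stack.pop()
        prev_smaller.append(stack[-1] if stack else None)
        stack.append(i)
    return prev_smaller

values = [0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15]
positions = all_nearest_smaller_values(values)
print([None if p is None else values[p] for p in positions])
# prints [None, 0, 0, 4, 0, 2, 2, 6, 0, 1, 1, 5, 1, 3, 3, 7]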

An even simpler linear-time sequential algorithm (Barbay, Fischer & Navarro (2012), Lemma 1) does not even need a stack; it assumes that the input sequence is given as an array A[1..n], and stores the index j of the closest previous smaller value of the ith value A[i] in P[i]. An artificial overall minimum, smaller than every input value, is assumed at A[0]:

for i from 1 to n:
    j = i-1
    while A[j] >= A[i]:
        j = P[j]
    P[i] = j
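In Python, this variant might be rendered as follows (a sketch; the sentinel value and names are illustrative). The array is padded with the artificial minimum at index 0, so the returned indices match the pseudocode, with 0 playing the role of the dash in the example above:

def all_nearest_smaller_values_no_stack(a):
    """Stack-free variant: returns P[1..n], where P[i] is the index,
    in the padded array, of the closest previous smaller value of A[i];
    0 (the sentinel position) means no previous smaller value exists."""
    A = [float('-inf')] + list(a)   # artificial overall minimum at A[0]
    n = len(a)
    P = [0] * (n + 1)
    for i in range(1, n + 1):
        j = i - 1
        # Follow the chain of previous-smaller links; the total number of
        # traversals over the whole algorithm is linear (Lemma 1).
        while A[j] >= A[i]:
            j = P[j]
        P[i] = j
    return P[1:]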

Parallel algorithms

Berkman, Schieber & Vishkin (1993) showed how to solve the all nearest smaller values problem efficiently on a concurrent-read concurrent-write Parallel Random Access Machine. For a sequence of n values, stored as an array, they show that the problem may be solved in time O(log log n) using a linear amount of total work. For sequences where all values are integers in the interval [1,s], Berkman, Matias & Ragde (1998) improved this bound to O(log log log s); they also showed that, for sufficiently large values of s, the previous doubly logarithmic time bound is the best that can be achieved for the problem. Since this work, parallel algorithms for the all nearest smaller values problem have also been developed on other models of parallel computation, including parallel computers with a hypercube-structured communications network,[3] and the bulk synchronous parallel model.[4]

Notes

  1. Bern, Eppstein & Teng (1999).
  2. Knuth, Donald (1968), "Vol. 1: Fundamental Algorithms", The Art of Computer Programming, Reading, Mass.: Addison-Wesley.
  3. Kravets & Plaxton (1996).
  4. He & Huang (2001).

References

  • Barbay, Jérémy; Fischer, Johannes; Navarro, Gonzalo (2012), "LRM-Trees: Compressed indices, adaptive sorting, and compressed permutations", Theoretical Computer Science, 459: 26–41, arXiv:1009.5863, doi:10.1016/j.tcs.2012.08.010.
  • Berkman, Omer; Matias, Yossi; Ragde, Prabhakar (1998), "Triply-logarithmic parallel upper and lower bounds for minimum and range minima over small domains", Journal of Algorithms, 28 (2): 197–215, doi:10.1006/jagm.1997.0905.
  • Berkman, Omer; Schieber, Baruch; Vishkin, Uzi (1993), "Optimal doubly logarithmic parallel algorithms based on finding all nearest smaller values", Journal of Algorithms, 14 (3): 344–370, doi:10.1006/jagm.1993.1018.
  • Bern, Marshall; Eppstein, David; Teng, Shang-Hua (1999), "Parallel construction of quadtrees and quality triangulations" (PDF), International Journal of Computational Geometry & Applications, World Scientific Publishing Company, 9 (6), doi:10.1142/S0218195999000303.
  • Gabow, Harold N.; Bentley, Jon Louis; Tarjan, Robert E. (1984), "Scaling and related techniques for geometry problems", STOC '84: Proc. 16th ACM Symp. Theory of Computing, New York, NY, USA: ACM, pp. 135–143, doi:10.1145/800057.808675, ISBN 0-89791-133-4.
  • He, Xin; Huang, Chun-Hsi (2001), "Communication efficient BSP algorithm for all nearest smaller values", Journal of Parallel and Distributed Computing, 61 (10): 1425–1438, doi:10.1006/jpdc.2001.1741.
  • Kravets, D.; Plaxton, C. G. (1996), "All nearest smaller values on the hypercube", IEEE Trans. Parallel and Distributed Systems, 7 (5): 456–462, doi:10.1109/71.503770.
  • Vuillemin, Jean (1980), "A unifying look at data structures", Commun. ACM, New York, NY, USA: ACM, 23 (4): 229–239, doi:10.1145/358841.358852.