fbpx
Wikipedia

Cache-oblivious distribution sort

The cache-oblivious distribution sort is a comparison-based sorting algorithm. It is similar to quicksort, but it is a cache-oblivious algorithm, designed for a setting where the number of elements to sort is too large to fit in a cache where operations are done. In the external memory model, the number of memory transfers it needs to perform a sort of items on a machine with cache of size and cache lines of length is , under the tall cache assumption that . This number of memory transfers has been shown to be asymptotically optimal for comparison sorts. This distribution sort also achieves the asymptotically optimal runtime complexity of .

Algorithm edit

Overview edit

Distribution sort operates on a contiguous array of   elements. To sort the elements, it performs the following:

  1. Partition the array into   contiguous subarrays of size  , and recursively sort each subarray.
  2. Distribute the elements of the sorted subarrays into   buckets   each of size at most   such that for every i from 1 to q-1, every element of bucket   is not larger than any element in   This distribution step is the main step of this algorithm, and is covered in more detail below.
  3. Recursively sort each bucket.
  4. Output the concatenation of the buckets.


Distribution step edit

As mentioned in step 2 above, the goal of the distribution step is to distribute the sorted subarrays into q buckets   The distribution step algorithm maintains two invariants. The first is that each bucket has size at most   at any time, and any element in bucket   is no larger than any element in bucket   The second is that every bucket has an associated pivot, a value which is greater than all elements in the bucket.

Initially, the algorithm starts with one empty bucket with pivot  . As it fills buckets, it creates new buckets by splitting a bucket into two when it would be made overfull (by having at least   elements placed into it). The split is done by performing the linear time median finding algorithm, and partitioning based on this median. The pivot of the lower bucket will be set to the median found, and the pivot of the higher bucket will be set to the same as the bucket before the split. At the end of the distribution step, all elements are in the buckets, and the two invariants will still hold.

To accomplish this, each subarray and bucket will have a state associated with it. The state of a subarray consists of an index next of the next element to be read from the subarray, and a bucket number bnum indicating which bucket index the element should be copied to. By convention,   if all elements in the subarray have been distributed. (Note that when we split a bucket, we have to increment all bnum values of all subarrays whose bnum value is greater than the index of the bucket that is split.) The state of a bucket consists of the value of the bucket's pivot, and the number of elements currently in the bucket.

Consider the follow basic strategy: iterate through each subarray, attempting to copy over its element at position next. If the element is smaller than the pivot of bucket bnum, then place it in that bucket, possibly incurring a bucket split. Otherwise, increment bnum until a bucket whose pivot is large enough is found. Though this correctly distributes all elements, it does not exhibit a good cache performance.

Instead, the distribution step is performed in a recursive divide-and-conquer. The step will be performed as a call to the function distribute, which takes three parameters i, j, and m. distribute(i, j, m) will distribute elements from the i-th through (i+m-1)-th subarrays into buckets, starting from  . It requires as a precondition that each subarray r in the range   has its  . The execution of distribute(i, j, m) will guarantee that each  . The whole distribution step is distribute . Pseudocode for the implementation of distribute is shown below:

def distribute(i, j, m: int) -> None:  """Distribute through recursive divide-and-conquer.""" if m == 1: copy_elems(i, j) else: distribute(i, j, m / 2) distribute(i + m / 2, j, m / 2) distribute(i, j + m / 2, m / 2) distribute(i + m / 2, j + m / 2, m / 2) 

The base case, where m=1, has a call to the subroutine copy_elems. In this base case, all elements from subarray i that belong to bucket j are added at once. If this leads to bucket j having too many elements, it splits the bucket with the procedure described beforehand.

See also edit

References edit

  • Harald Prokop. Cache-Oblivious Algorithms. Masters thesis, MIT. 1999.

cache, oblivious, distribution, sort, this, article, multiple, issues, please, help, improve, discuss, these, issues, talk, page, learn, when, remove, these, template, messages, this, article, relies, excessively, references, primary, sources, please, improve,. This article has multiple issues Please help improve it or discuss these issues on the talk page Learn how and when to remove these template messages This article relies excessively on references to primary sources Please improve this article by adding secondary or tertiary sources Find sources Cache oblivious distribution sort news newspapers books scholar JSTOR May 2014 Learn how and when to remove this template message This article may be too technical for most readers to understand Please help improve it to make it understandable to non experts without removing the technical details May 2014 Learn how and when to remove this template message Learn how and when to remove this template message The cache oblivious distribution sort is a comparison based sorting algorithm It is similar to quicksort but it is a cache oblivious algorithm designed for a setting where the number of elements to sort is too large to fit in a cache where operations are done In the external memory model the number of memory transfers it needs to perform a sort of N displaystyle N items on a machine with cache of size Z displaystyle Z and cache lines of length L displaystyle L is O NLlogZ N displaystyle O frac N L log Z N under the tall cache assumption that Z W L2 displaystyle Z Omega L 2 This number of memory transfers has been shown to be asymptotically optimal for comparison sorts This distribution sort also achieves the asymptotically optimal runtime complexity of 8 Nlog N displaystyle Theta N log N Contents 1 Algorithm 1 1 Overview 1 2 Distribution step 2 See also 3 ReferencesAlgorithm editOverview edit Distribution sort operates on a contiguous array of N displaystyle N nbsp elements To sort the elements it performs the following Partition the array into N displaystyle sqrt N nbsp contiguous subarrays of size N displaystyle sqrt N nbsp and recursively sort each subarray Distribute the elements of the sorted subarrays into q N displaystyle q leq sqrt N nbsp buckets B1 B2 Bq displaystyle B 1 B 2 ldots B q nbsp each of size at most 2N displaystyle 2 sqrt N nbsp such that for every i from 1 to q 1 every element of bucket Bi displaystyle B i nbsp is not larger than any element in Bi 1 displaystyle B i 1 nbsp This distribution step is the main step of this algorithm and is covered in more detail below Recursively sort each bucket Output the concatenation of the buckets Distribution step edit As mentioned in step 2 above the goal of the distribution step is to distribute the sorted subarrays into q buckets B1 B2 Bq displaystyle B 1 B 2 ldots B q nbsp The distribution step algorithm maintains two invariants The first is that each bucket has size at most 2N displaystyle 2 sqrt N nbsp at any time and any element in bucket Bi displaystyle B i nbsp is no larger than any element in bucket Bi 1 displaystyle B i 1 nbsp The second is that every bucket has an associated pivot a value which is greater than all elements in the bucket Initially the algorithm starts with one empty bucket with pivot displaystyle infty nbsp As it fills buckets it creates new buckets by splitting a bucket into two when it would be made overfull by having at least 2N 1 displaystyle 2 sqrt N 1 nbsp elements placed into it The split is done by performing the linear time median finding algorithm and partitioning based on this median The pivot of the lower bucket will be set to the median found and the pivot of the higher bucket will be set to the same as the bucket before the split At the end of the distribution step all elements are in the buckets and the two invariants will still hold To accomplish this each subarray and bucket will have a state associated with it The state of a subarray consists of an index next of the next element to be read from the subarray and a bucket number bnum indicating which bucket index the element should be copied to By convention bnum displaystyle bnum infty nbsp if all elements in the subarray have been distributed Note that when we split a bucket we have to increment all bnum values of all subarrays whose bnum value is greater than the index of the bucket that is split The state of a bucket consists of the value of the bucket s pivot and the number of elements currently in the bucket Consider the follow basic strategy iterate through each subarray attempting to copy over its element at position next If the element is smaller than the pivot of bucket bnum then place it in that bucket possibly incurring a bucket split Otherwise increment bnum until a bucket whose pivot is large enough is found Though this correctly distributes all elements it does not exhibit a good cache performance Instead the distribution step is performed in a recursive divide and conquer The step will be performed as a call to the function distribute which takes three parameters i j and m distribute i j m will distribute elements from the i th through i m 1 th subarrays into buckets starting from Bj displaystyle B j nbsp It requires as a precondition that each subarray r in the range i i m 1 displaystyle i ldots i m 1 nbsp has its bnum r j displaystyle bnum r geq j nbsp The execution of distribute i j m will guarantee that each bnum r j m displaystyle bnum r geq j m nbsp The whole distribution step is distribute 1 1 N displaystyle 1 1 sqrt N nbsp Pseudocode for the implementation of distribute is shown below def distribute i j m int gt None Distribute through recursive divide and conquer if m 1 copy elems i j else distribute i j m 2 distribute i m 2 j m 2 distribute i j m 2 m 2 distribute i m 2 j m 2 m 2 The base case where m 1 has a call to the subroutine copy elems In this base case all elements from subarray i that belong to bucket j are added at once If this leads to bucket j having too many elements it splits the bucket with the procedure described beforehand See also editCache oblivious algorithm Funnelsort External sortingReferences editHarald Prokop Cache Oblivious Algorithms Masters thesis MIT 1999 Retrieved from https en wikipedia org w index php title Cache oblivious distribution sort amp oldid 1183011430, wikipedia, wiki, book, books, library,

article

, read, download, free, free download, mp3, video, mp4, 3gp, jpg, jpeg, gif, png, picture, music, song, movie, book, game, games.