Sorting on twodimensional processor arraysSorting on twodimensional processor arrays 
The definition of the sorting problem has to be generalized in order to cope with twodimensional arrays (and moreover, arbitrary arrangements) of data.
Definition: Let A be a set of data with some order relation . A data record is a mapping a : J A, where J is a finite index set. Another writing for a data record a is (a_{i})_{i J} .
Example: A data sequence a_{0}, ..., a_{n1} is a data record with J = {0, ..., n1}.
An n×narray of data is a data record with J = {0, ..., n1} × {0, ..., n1}.
Definition: A sorting order is a bijective mapping ρ : J {0, ..., J1}.
Example: For data sequences ascending order is expressed by ρ(i) = i, descending order by ρ(i) = n – 1 – i.
For n×narrays usually the following sorting orders are considered:
rowmajor:  ρ(i, j) = i·n + j  
snakelike: 

Definition: The sorting problem is defined to reorder a data record (a_{i})_{i J} to a data record (a_{φ(i)})_{i J} such that:
i, j J : a_{φ(i)}a_{φ(j)} for ρ(i) < ρ(j)
Here φ is a permutation of the index set J.
In the following, n×ndata records are to be sorted on an n×nprocessor array. The processors are connected as a mesh, i.e. only adjacent processors can exchange data with each other. The data are associated with the processors in a natural way such that each processor (i, j) contains data element a_{i, j}. The sorting order is rowmajor.
Assume that the smallest element is at the beginning in the lower right corner of the n×narray, then at least 2n – 2 steps are necessary to move it to its place in the upper right corner. This trivial lower bound has been improved by Kunde to 3n – 2n – 3 steps [Kun 87].
In Figure 1 an n×narray is shown. The triangle in the lower right corner is called "joker zone", since data in this zone are left undetermined for the time being. Outside the joker zone the array is filled with all different data.
 
Figure 1: n×narray with joker zone  
The data of the joker zone need at least 2n – 2n – 2 steps to reach the upper left corner of the array. The element x that happens to be in the upper left corner at that time is therefore independent of the data of the joker zone: regardless of the data values of the joker zone, after 2n – 2n – 2 steps there will always be the same element x in the upper left corner.
The data values of the joker zone can therefore be chosen in a way such that element x belongs to the right edge of the array in the sorted order. Since the joker zone contains n elements there is enough mass available to push x to the right edge. The distance that element x still has to move is n1, corresponding to n1 additional steps. This means that altogether at least 3n – 2n – 3 steps are necessary to sort the array.
Already in 1977 Thompson and Kung published a paper on sorting on n×nprocessor arrays [TK 77]. Their algorithm s^{2}way Mergesort is optimal not only asymptotically but also concerning the constant of the highestorder term. The time complexity of the algorithm is 3n + O(n^{2/3} log(n)) comparisonexchange operations.
In the sequel, several more papers appeared that presented O(n)sorting algorithms for n×nprocessor arrays [LSSS 85][SchSh 86][MG 88]. Some of these algorithms are simpler than that of Thompson/Kung, but all are slower. Only the algorithm of Schnorr/Shamir [SchSh 86] also reaches 3n as the highestorder term.
In practice, there is a need for an algorithm that is simple and easy to implement. In the following, three algorithms are presented: LS3 Sort by Lang, Schimmler, Schmeck und Schröder [LSSS 85], 4way mergesort by Schimmler [Schi 87] and rotatesort by Marberg and Gafni [MG 88].
[Kun 87]  M. Kunde: Lower Bounds for Sorting on MeshConnected Architectures. Acta Informatica 24, 121130 (1987)  
[LSSS 85]  H.W. Lang, M. Schimmler, H. Schmeck, H. Schröder: Systolic Sorting on a MeshConnected Network. IEEE Transactions on Computers C34, 7, 652658 (1985)  
[MG 88]  J.M. Marberg, E. Gafni: Sorting in Constant Number of Row and Column Phases on a Mesh. Algorithmica 3, 561572 (1988)  
[Schi 87]  M. Schimmler: Fast Sorting on the Instruction Systolic Array. Bericht Nr. 8709, Institut für Informatik, ChristianAlbrechtsUniversität Kiel (1987)  
[SchSh 86]  C.P. Schnorr, A. Shamir: An Optimal Sorting Algorithm for MeshConnected Computers. Proc. of the 18th ACM Symposium on Theory of Computing, 255261 (1986)  
[TK 77]  C.D. Thompson, H.T. Kung: Sorting on a MeshConnected Parallel Computer. Communications of the ACM, 20, 4, 263271 (1977) 
Next: [LS3 sort] or 
H.W. Lang Hochschule Flensburg lang@hsflensburg.de Impressum © Created: 01.03.1997 Updated: 05.06.2016 