Sorting on two dimensional processor arraysAlgorithm 4way mergesort 
In practice, there is a need for an algorithm that is simple and easy to implement. In the following, the algorithm 4way mergesort of Schimmler [Schi 87] is presented.
Definition: An m×narray of data is called roughly sorted, if sorting of the rows suffices to sort the array completely.
In a roughly sorted array each data element is already in its proper row. The idea of 4way mergesort is to merge four roughly sorted k/2×k/2arrays to one roughly sorted k×karray.
Procedure 4way merge (k)  
Input:  k×karray, whose four k/2×k/2subarrays are roughly sorted 
Output:  the roughly sorted k×karray 
Method: 

The correctness of procedure 4way merge can be shown in an illustrative way by the 01principle. (The 01principle is also valid for the term "roughly sorted", since by subsequent sorting of the rows a sorted array can be obtained.)
Starting point is an array containing zeroes and ones, whose four subarrays are roughly sorted. In Figure 2(a) this situation is illustrated (zeroes are drawn white, ones gray).
Sorting the rows of the subarrays in step 1 results in a situation similar to Figure 2(b).
Sorting the columns in step 2 yields two roughly sorted k×k/2arrays (Figure 2(c)).
After sorting the rows in alternating order in step 3 two different situations are possible: either the "dirty" halfrows (i.e. halfrows in which zeroes and ones occur) of situation (c) are in different halves (Figure 2(d)) or in the same half of the array (in the right half in Figure 2(e)). This depends on the number of full 1halfrows in situation (c). In both cases however, sorting of the columns in step 4 leads to the situation shown in Figure 2(f), namely a roughly sorted k×karray.
 
Figure 1: Proof of the correctness of procedure 4way merge using the 01principle  
By recursive application of procedure 4way merge a totally unsorted array is roughly sorted:
Procedure roughsort (k)  
Input:  unsorted k×karray 
Output:  roughly sorted k×karray 
Method: 

An n×narray is sorted by roughly sorting it and then sorting the rows:
Algorithm 4way mergesort (n)  
Input:  unsorted n×narray 
Output:  sorted n×narray 
Method: 

Following from the 01principle algorithm 4way mergesort sorts every n×narray filled with arbitrary data, since it sorts all 01arrays.
Sorting a row or a column of length k with oddeven transposition sort takes k steps. In procedure 4way merge this amounts to the following:
step 1:  k/2  steps  
step 2:  k  steps  
step 3:  k  steps  
step 4:  k/2  steps  
altogether:  3k  steps 
In step 4 only k/2 steps of oddeven transposition sort are necessary, since each column consists of two sorted subsequences shuffled into each other.
The recursive execution of 4way merge in roughsort takes 3n + 3n/2 + 3n/4 + ... + 3 6n steps. Thus, including n steps for the subsequent sorting of the rows 4way mergesort has a time complexity of
T(n) 7n
The following applet shows how 4way mergesort sorts a 32×32array of 0's and 1's. For more clarity, operations that are performed in parallel on a twodimensional processor array are shown sequentially here.
Algorithm 4way mergesort for sorting an n×narray has been presented. It has the same asymptotic complexity as the algorithm of Thompson/Kung [TK 77], namely O(n). Concerning the constant it is slower by a factor of about 2. However, its quality lies in its simplicity. An implementation of the algorithm on an instruction systolic array is given in [Schi 87] and in [Lan 90].
[Lan 90]  H.W. Lang: Das befehlssystolische Prozessorfeld  Architektur und Programmierung. Dissertation, ChristianAlbrechtsUniversität Kiel (1990)  
[Schi 87]  M. Schimmler: Fast Sorting on the Instruction Systolic Array. Bericht Nr. 8709, Institut für Informatik, ChristianAlbrechtsUniversität Kiel (1987)  
[TK 77]  C.D. Thompson, H.T. Kung: Sorting on a MeshConnected Parallel Computer. Communications of the ACM, 20, 4, 263271 (1977) 
Next: [Rotatesort] or 
H.W. Lang Hochschule Flensburg lang@hsflensburg.de Impressum © Created: 01.03.1997 Updated: 09.10.2016 