Instruction systolic array (ISA)

Communication primitives

 up

Essentially, the ISA is nothing else than a pipelined SIMD array, and from this point of view it is easy to understand an ISA program, if one thinks of it as a sequence of isolated sweeps of instructions. There is an exception where this does not work, namely when communication among the processors occurs. Then the actual timing of the execution of the instructions becomes essential. However, the operations realized by these communication instructions, as e.g. broadcast or ring shift, may again be viewed as sweeps.

It is an interesting fact that broadcast and ring shift operations can be performed with a constant number of instructions although there are no global wires or wrap-around connections.

Broadcast

In order to accomplish data communication between two adjacent processors P and Q, processor P writes a data item x into its communication register and, in the next instruction cycle, processor Q reads x from P's communication register.

Processor Q can write x into its own communication register so that another processor R adjacent to Q can read x again one instruction cycle later, and so on.

The point is that only one instruction is needed in order to propagate x in this way from left to right along a row of processors. This instruction is C:=CW (read the contents of the communication register of the western neighbour and write it into the own communication register), denoted by an arrow symbol in Figure 1.

Enable Java to see an animation of this figure
Broadcast along the first row of the array
Figure 1: Broadcast along the first row of the array

 

We call such a propagation from left to right (or from top to bottom) a broadcast, since it is realized by only one instruction, just as if there were a broadcast line connecting all processors of a row (or column, respectively). Sending a data item from right to left along a row takes 2n instructions. So one would write programs in a way that whenever possible propagating data is done from left to right or from top to bottom.

In the notation of the programming language Laisa a broadcast can be expressed like this

procedure broadcast_row(n : integer; s : selector);

begin(n)
    < set CW, C; s; [2..n] >;
end;

The program broadcasts the communication register contents of the processor in the first column of the processor array to all processors of the corresponding row; this is done in all rows specified by row selector s.

For instance, the program of Figure 3 is expressed in the following way:

broadcast_row (n, [1]);

Input

The processors of the first column of the array can perform input over the connection to their non-existing left neighbours. Similarly, the processors of the first row of the array can perform input over the connection to their non-existing top neighbours. Actually, there are input buffers available at the left and upper border of the array.

Input is similar to the broadcast operation. The following program shows how a data item x is input into the first row of the array.

Enable Java to see an animation of this figure
Input to the first row of the array
Figure 2: Input to the first row of the array

Aggregation

If the instruction C:=CW is replaced with C:=C+CW (add the contents of the communication register of the western neighbour to the own communication register), a summation over an entire row can be performed with one instruction. If this is done in all rows simultaneously, followed by the instruction C:=C+CN executed in the last column, the sum of all communication register contents of the entire array is obtained in the lower right processor (Figure 3).

Enable Java to see an animation of this figure
Summation over the whole array
Figure 3: Summation over the whole array

 

In the notation of Laisa this program reads like this:

procedure array_sum(n : integer);

begin(n)
    < add C, CW, C; [1..n]; [2..n] >;
    < add C, CN, C; [2..n]; [n] >;
end;

The program above says that the first addition shall take place in rows 1, ..., n and in columns 2, ..., n, and the second addition in rows 2, ..., n but only in column n.

Aggregation, as well as broadcast, can only performed from left to right with a constant number of instructions.

Ring shift

The fictitious broadcast line can be used as a "wrap-around connection" of the array in order to realize ring shift operations. If the communication register contents of processors 2, ..., n of a row are simultaneously read by their left neighbours, a left shift of the data of this row is performed. However, the data item in the first processor would be lost. Therefore, this data item is broadcast to the last processor, yielding a ring shift of the data items. Figure 4 shows the program for a 4 × 4 array. Again we have the result that a constant number of instructions suffice to realize a ring shift operation, just as if the array had wrap-around connections.

Enable Java to see an animation of this figure
Ring shift along the first row of the array
Figure 4: Ring shift along the first row of the array

In the notation of the programming language Laisa a ring shift of a row by k steps is expressed in the following way:

procedure ringshift_row(n : integer, k: integer, s : selector);
{ performs k ring shift steps in rows specified by selector s }
begin(n)
    repeat k times
    begin
        < set CE, C; s; [2..n] >;
        < set CW, C; s; [1..n-1] >;
    end
end;

For instance, the program of Figure 4 would be called by

ringshift_row(n, 1, [1]);

since it performs a ring shift by 1 step in row 1.

 

Broadcast and ring shift as well as the aggregation operation have turned out as valuable means for writing efficient parallel programs. While allowing communication along a whole row or column, these operations use only local communication and take only one or two instructions. But the number of instructions is the natural measure for the time complexity of an ISA program. The matrix multiplication program and the transitive closure program in the next section are examples for the use of broadcast and ringshift operations.

 

Next:   [Parallel programs]   or   up

 

homeH.W. Lang   Hochschule Flensburg   lang@hs-flensburg.de   Impressum   Datenschutz   ©   Created: 24.09.2002   Updated: 20.06.2018
Valid HTML 4.01 Transitional