What's the best type of interconnection network for multiprocessor MIMD machines with several hundred processors?

By Simon Loader

In a multiprocessor MIMD machine with several hundred processors, memory access must be fast and readily available. Each of those processors will be accessing different memory locations for data and instructions.

Slow access to either will impede the performance of the machine as a whole. A single shared memory on a single bus will not perform well, because the memory bus becomes a bottleneck. Machines built this way are often called Uniform Memory Access (UMA) machines. Each processor therefore needs memory access of its own.

One possibility is to give each processing element (PE) local memory of its own, which is shared across all the processors. This is often known as distributed shared memory MIMD, and more popularly as a Non-Uniform Memory Access (NUMA) machine. It is also possible to use the local cache on a PE as its local memory instead of external memory; these are called Cache-Only Memory Access (COMA) architectures. Access to another processor's local memory is slow, so performance is poor for fine-grained processing, which requires access to other PEs' data and a lot of synchronisation. Performance for coarse-grained processing, however, is good, and for that kind of workload this would be a suitable architecture.

For fine-grained processing, the network connecting the processors needs careful consideration. The distance from one processing element to another must be kept as short as possible to allow fast communication. Several interconnections are possible. One is a simple mesh, where each PE is connected to its four neighbouring PEs. This allows relatively fast communication: the worst-case number of PEs a message must pass through is on the order of the square root of the number of PEs. The hypercube is also popular: the distance between any two PEs is at most n hops, where the number of processors is 2^n.
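To make the mesh and hypercube comparison concrete, here is a minimal Python sketch that counts hops in both topologies. It assumes a square mesh without wraparound links and a hypercube whose PEs are numbered so that neighbours differ in exactly one address bit; the function names are illustrative only. Under those assumptions the corner-to-corner distance in a mesh of N PEs is 2(sqrt(N) - 1), which grows with the square root of N, while the hypercube's worst case is n hops for 2^n PEs.

```python
import math


def mesh_distance(a, b, side):
    """Hops between PEs a and b in a side x side mesh (no wraparound links)."""
    ax, ay = divmod(a, side)
    bx, by = divmod(b, side)
    return abs(ax - bx) + abs(ay - by)


def hypercube_distance(a, b):
    """Hops between PEs a and b in a hypercube: the number of differing address bits."""
    return bin(a ^ b).count("1")


def mesh_diameter(num_pes):
    """Worst-case hop count in a square mesh of num_pes PEs."""
    side = math.isqrt(num_pes)
    if num_pes < 1 or side * side != num_pes:
        raise ValueError("num_pes must be a perfect square")
    return 2 * (side - 1)


def hypercube_diameter(num_pes):
    """Worst-case hop count in a hypercube of num_pes PEs (num_pes = 2**n)."""
    if num_pes < 1 or num_pes & (num_pes - 1):
        raise ValueError("num_pes must be a power of two")
    return num_pes.bit_length() - 1


if __name__ == "__main__":
    for num_pes in (64, 256, 1024):
        print(num_pes, mesh_diameter(num_pes), hypercube_diameter(num_pes))
```

For 256 PEs this gives a worst case of 30 hops for the mesh against 8 for the hypercube. The price of the hypercube is that each PE needs n links rather than four.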

Another way to alleviate the problem of fine-grained processing on a large MIMD system is to use Cache-Coherent NUMA (ccNUMA). Each PE's cache is a window onto the shared memory space, kept coherent by various methods such as snooping on accesses to memory it may have cached. This gives each processor fast access to the data it has cached while still allowing fast access to shared data. It can become a problem if the type of processing causes many cache misses.
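The snooping idea can be illustrated with a small Python model. This is only a sketch under simplifying assumptions that the article does not specify: a write-through, write-invalidate protocol, one word per cache line, and a broadcast that reaches every cache. The class and method names are hypothetical. It shows how a write by one PE invalidates stale copies elsewhere, and how each invalidation later costs the other PE a cache miss.

```python
class SharedMemory:
    """Backing store plus the list of caches that snoop on writes to it."""

    def __init__(self):
        self.data = {}
        self.caches = []


class SnoopingCache:
    """One PE's cache: write-through, write-invalidate (assumed protocol)."""

    def __init__(self, memory):
        self.memory = memory
        self.lines = {}   # address -> cached value
        self.misses = 0   # number of reads that had to go to shared memory
        memory.caches.append(self)

    def read(self, addr):
        if addr not in self.lines:
            # Miss: fetch from shared memory (unwritten addresses read as 0).
            self.misses += 1
            self.lines[addr] = self.memory.data.get(addr, 0)
        return self.lines[addr]

    def write(self, addr, value):
        # Update the local copy and shared memory, then tell every other
        # cache to drop its copy of this address.
        self.lines[addr] = value
        self.memory.data[addr] = value
        for cache in self.memory.caches:
            if cache is not self:
                cache.snoop_invalidate(addr)

    def snoop_invalidate(self, addr):
        # Another PE wrote this address, so any local copy is stale.
        self.lines.pop(addr, None)


if __name__ == "__main__":
    mem = SharedMemory()
    pe0, pe1 = SnoopingCache(mem), SnoopingCache(mem)
    pe0.write(0x10, 1)
    print(pe1.read(0x10), pe1.misses)   # first read misses
    pe0.write(0x10, 2)                  # invalidates pe1's copy
    print(pe1.read(0x10), pe1.misses)   # misses again, sees the new value
```

If two PEs write the same address repeatedly, every write forces a miss on the other side, which is the cache-miss problem described above.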

The interconnection for a MIMD system with several hundred processors is difficult to decide upon without knowledge of the type of processing required. Because of the number of processors involved, the simple shared memory architectures used for smaller MIMD machines are not scalable. A new form of interconnection is required, one that can meet the demands of a high number of processors.

About the author: Simon Loader is a UNIX and email specialist who runs Surf, a free IT resource and downloads website, in his spare time. Many of the downloads and articles Simon has created for Surf are featured on technical websites all over the world. To access FREE downloads and information covering several diverse IT topics, visit www.surf.org.uk.