Parallel Computing at IMM UB RAS
Obtaining High Performance with ScaLAPACK Codes

We suggest the following approach to obtain high performance with ScaLAPACK codes:
The standard data distribution will typically achieve 25-50% of the peak performance possible (depending in part on how many processors are ignored, i.e., the difference between the number of processors in the process grid and the number available). We do not recommend experimenting with different data distributions until acceptable (or nearly acceptable) performance has been achieved. If each individual node requires a block size larger than 64 to achieve near-peak performance on local matrix-matrix multiply, the block size may have to be increased. This step is unlikely to be necessary, however, unless the computer has shared-memory multiprocessor nodes with more than four processors each.
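
To make the notion of "ignored" processors concrete, the following is a minimal sketch in Python. It assumes the process grid is chosen as the largest square that fits in the P available processors; that choice, and the helper name square_grid, are assumptions made here for illustration only. ScaLAPACK and the BLACS do not provide this function.

```python
import math


def square_grid(p):
    """Largest square process grid that fits in p processors.

    Hypothetical helper for illustration only.
    Returns (pr, pc, ignored), where ignored = p - pr * pc.
    """
    if p < 1:
        raise ValueError("need at least one processor")
    side = math.isqrt(p)  # integer floor of sqrt(p)
    return side, side, p - side * side


if __name__ == "__main__":
    for p in (4, 9, 12, 16, 30):
        pr, pc, ignored = square_grid(p)
        print(f"P={p:3d}: grid {pr}x{pc}, {ignored} processor(s) ignored")
```

Under this assumption, 12 processors give a 3x3 grid with 3 processors left idle, while 16 processors give a 4x4 grid with none idle. A non-square grid could use more of the processors, which is one reason the achievable fraction of peak depends on how P factors.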
Susan Blackford Tue May 13 09:21:01 EDT 1997