The efficiency of numerical algorithms used to be measured by the number of arithmetic operations (additions, subtractions, multiplications, and divisions) required. On current machines, an operation count alone may no longer give a reasonable estimate of the relative performance of competing methods. Data locality (how well an algorithm reuses data already held in fast memory) and potential for parallelism (the ability to make effective use of multiple processors simultaneously) are also important properties of any numerical algorithm.
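As a rough illustration of why an operation count can be misleading, the sketch below multiplies two matrices in two ways: a textbook triple loop and a tiled (blocked) loop. Both perform exactly the same multiply-adds and differ only in the order in which memory is touched. The function names and the `block` parameter are hypothetical, chosen for this example only, and the code is not taken from any library.

```python
def matmul_naive(a, b):
    """Textbook triple loop: c[i][j] = sum over k of a[i][k] * b[k][j]."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    for i in range(n):
        for j in range(p):
            for k in range(m):
                c[i][j] += a[i][k] * b[k][j]
    return c


def matmul_blocked(a, b, block=2):
    """Same arithmetic, carried out tile by tile.

    Each (i0, j0, k0) step works on small sub-blocks of a, b, and c, so a
    tile that fits in fast memory is reused several times before the
    computation moves on. `block` is an illustrative tile size.
    """
    n, m, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    for i0 in range(0, n, block):
        for j0 in range(0, p, block):
            for k0 in range(0, m, block):
                # Multiply one tile of a by one tile of b into one tile of c.
                for i in range(i0, min(i0 + block, n)):
                    for j in range(j0, min(j0 + block, p)):
                        s = c[i][j]
                        for k in range(k0, min(k0 + block, m)):
                            s += a[i][k] * b[k][j]
                        c[i][j] = s
    return c


def count_multiply_adds(n, m, p):
    """Multiply-add count for an (n x m) times (m x p) product.

    The count is the same for both routines above.
    """
    return n * m * p
```

In pure Python the interpreter overhead hides any cache effect, so this only shows the access pattern. In a compiled language on a machine with a memory hierarchy, the tiled ordering is the one that can reuse data from fast memory, even though the operation count is unchanged.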

In the LAPACK project, dense linear algebra algorithms were implemented so as to minimize memory references and allow the greatest possible use of parallelism. More recently, we have implemented parallel integral equation solvers, as well as iterative methods connected with solving the neutron transport equation, on the IBM-SP2.

For more details, see the references:

LAPACK Users' Guide
Second Edition, SIAM, 1995,
by E. Anderson et al.

Fast Parallel Iterative Solution of Poisson's and the Biharmonic Equations on Irregular Regions
SIAM J. Sci. Stat. Comput., 13 (1992), pp. 101-117,
by A. Mayo and A. Greenbaum.