N. Bell, S. Dalton, and L. Olson. Exposing Fine-Grained Parallelism in Algebraic Multigrid Methods. SIAM J. Sci. Comp., 34(4):C123–C152, 2012.
E. Chow and A. Patel. Fine-Grained Parallel Incomplete LU Factorization. SIAM J. Sci. Comp., 37(2):C169–C193, 2015.
E. Cuthill and J. McKee. Reducing the bandwidth of sparse symmetric matrices. In Proceedings of the 1969 24th National Conference, ACM '69, pages 157–172. ACM, 1969.
Fink.
G. H. Golub and C. F. Van Loan. Matrix Computations. Johns Hopkins University Press, 1996.
J. L. Greathouse and M. Daga. Efficient Sparse Matrix-Vector Multiplication on GPUs Using the CSR Storage Format. In Proc. HPC Netw., Stor. Anal., SC '14, pages 769–780. ACM, 2014.
F. Gremse, A. Höfter, L. O. Schwen, F. Kiessling, and U. Naumann. GPU-Accelerated Sparse Matrix-Matrix Multiplication by Iterative Row Merging. SIAM J. Sci. Comp., 37(1):C54–C71, 2015.
M. J. Grote and T. Huckle. Parallel Preconditioning with Sparse Approximate Inverses. SIAM J. Sci. Comp., pages 838–853, 1997.
T. Huckle. Factorized Sparse Approximate Inverses for Preconditioning. J. Supercomput., pages 109–117, 2003.
D. D. Lee and S. H. Seung. Algorithms for Non-negative Matrix Factorization. In Advances in Neural Information Processing Systems 13, pages 556–562, 2000.
J. G. Lewis. Algorithm 582: The Gibbs-Poole-Stockmeyer and Gibbs-King algorithms for reordering sparse matrices. ACM Trans. Math. Softw., 8:190–194, 1982.
Y. Saad. Iterative Methods for Sparse Linear Systems, Second Edition. Society for Industrial and Applied Mathematics, April 2003.
H. D. Simon. The Lanczos algorithm with partial reorthogonalization. Mathematics of Computation, 42:115–142, 1984.
U. Trottenberg, C. Oosterlee, and A. Schüller. Multigrid. Academic Press, 2001.
U. M. Yang. Numerical Solutions of Partial Differential Equations on Parallel Computers, chapter Parallel Algebraic Multigrid Methods - High Performance Preconditioners, pages 209–236. Lecture Notes in Computational Science and Engineering. Springer, 2006.