Publications

L
Williams, S., J. Carter, L. Oliker, J. Shalf, and K. Yelick, "Lattice Boltzmann Simulation Optimization on Leading Multicore Platforms", Journal of Parallel and Distributed Computing, vol. 69, no. 9, pp. 762-777, 2009.
Williams, S., J. Carter, L. Oliker, J. Shalf, and K. Yelick, "Lattice Boltzmann Simulation Optimization on Leading Multicore Platforms", International Parallel and Distributed Processing Symposium (IPDPS), Miami, Florida, pp. 1-14, 2008.
Huck, K. A., K. Potter, D. W. Jacobsen, H. Childs, and A. D. Malony, "Linking Performance Data into Scientific Visualization Tools", 1st Workshop on Visual Performance Analysis (VPA), Held in conjunction with SC14, New Orleans, LA, USA, IEEE, 11/2014.
Pearce, O., T. Gamblin, B. R. de Supinski, T. Arsenlis, and N. M. Amato, "Load Balancing N-Body Simulations with Highly Non-Uniform Density", International Conference on Supercomputing (ICS'14), 06/2014.
Ma, T., A. Bouteiller, G. Bosilca, and J. Dongarra, "Locality and topology aware intra-node communication among multicore CPUs", 17th EuroMPI Conference: Springer-Verlag Berlin, Heidelberg, sep, 2010.
Venkat, A., M. W. Hall, and M. Strout, "Loop and data transformations for sparse matrix code", Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation, Portland, OR, USA, June 15-17, 2015.
Hall, M., J. Chame, J. Shin, C. Chen, G. Rudy, and M. M. Khan, "Loop Transformation Recipes for Code Generation and Auto-Tuning", Proceedings of the Workshop on Languages and Compilers for Parallel Computing, oct, 2009.
M
Marin, G., G. Jin, and J. Mellor-Crummey, "Managing Locality in Grand Challenge Applications: A Case Study of the Gyrokinetic Toroidal Code", Proc. of SciDAC 2008, J. of Physics: Conference Series: Institute of Physics Publishing, June, 2008.
Weaver, V., M. Johnson, K. Kasichayanula, J. Ralph, P. Luszczek, D. Terpstra, and S. Moore, "Measuring energy and power with PAPI", International Workshop on Power-Aware Systems and Architectures (PASA 2012), Pittsburgh, PA, September 10, 2012.
Ho, C-H., M. de Kruijf, K. Sankaralingam, B. Rountree, M. Schulz, and B. de Supinski, "Mechanisms and evaluation of cross-layer fault-tolerance for supercomputing", In the 41st International Conference on Parallel Processing (ICPP), Pittsburgh, PA, Sep, 2012.
Madduri, K., S. Williams, S. Ethier, L. Oliker, J. Shalf, E. Strohmaier, and K. Yelick, "Memory-efficient Optimization of Gyrokinetic Particle-to-Grid Interpolation for Multicore Processors", Proc. ACM/IEEE Conf. on Supercomputing (SC 2009): The Parallel Computing Laboratory, pp. 48:1–48:12, 2009.
Su, CY., D. Li, D. S. Nikolopoulos, K. W. Cameron, B. R. de Supinski, and E. A. Leon, "Model-based, memory-centric performance and power optimization on NUMA multiprocessors", IEEE International Symposium on Workload Characterization (IISWC), 4-6 November 2012, La Jolla, California, 2012.
Meswani, M. R., L. Carrington, D. Unat, A. Snavely, S. B. Baden, and S. Poole, "Modeling and predicting performance of high performance computing applications on hardware accelerators", International Journal of High Performance Computing Applications (IJHPCA), vol. 27, pp. 89-108, 2012.
Tiwari, A., L. Carrington, and A. Snavely, "Modeling Power and Energy Usage of HPC Kernels", International Parallel and Distributed Processing Symposium Workshops & PhD Forum (IPDPSW), 21-25 May 2012, Shanghai, China, IEEE Computer Society, Washington, D.C., USA, 2012.
