Williams, S., J. Carter, L. Oliker, J. Shalf, and K. Yelick, "Lattice Boltzmann Simulation Optimization on Leading Multicore Platforms", Journal of Parallel and Distributed Computing, vol. 69, no. 9, pp. 762-777, 2009.
Hall, M., J. Chame, J. Shin, C. Chen, G. Rudy, and M. M. Khan, "Loop Transformation Recipes for Code Generation and Auto-Tuning", Proceedings of the Workshop on Languages and Compilers for Parallel Computing, October, 2009.
Madduri, K., S. Williams, S. Ethier, L. Oliker, J. Shalf, E. Strohmaier, and K. Yelick, "Memory-efficient Optimization of Gyrokinetic Particle-to-Grid Interpolation for Multicore Processors", Proc. ACM/IEEE Conf. on Supercomputing (SC 2009): The Parallel Computing Laboratory, pp. 48:1–48:12, 2009.
Porterfield, A., N. Nassar, and R. Fowler, "Multi-Threaded Library for Many-Core Systems", Workshop on Multithreaded Architectures and Applications, Rome, Italy, IEEE, 2009.
Tikir, M., M. Laurenzano, L. Carrington, and A. Snavely, "PSINS: An Open Source Event Tracer and Execution Simulator for MPI Applications", Euro-PAR 2009, August, 2009.
Olschanowsky, C. M., M. M. Tikir, L. Carrington, and A. Snavely, "PSnAP: accurate synthetic address streams through memory profiles", Workshop on Languages and Compilers for Parallel Computing (LCPC 2009), pp. 353-367, 2009.
Fuerlinger, K., and S. Moore, "Recording the Control Flow of Parallel Applications to Determine Iterative and Phase-Based Behavior", Future Generation Computer Systems, vol. 26, no. 1, pp. 162-166, January, 2009.
Williams, S., J. Carter, L. Oliker, J. Shalf, and K. Yelick, "Resource-Efficient, Hierarchical Auto-Tuning of a Hybrid Lattice Boltzmann Computation on the Cray XT4", Proc. CUG09: Cray User Group meeting: The Parallel Computing Laboratory, 2009.
Tiwari, A., C. Chen, J. Chame, M. Hall, and J. K. Hollingsworth, "A scalable auto-tuning framework for compiler optimization", Proceedings of the International Parallel and Distributed Processing Symposium, April, 2009.
Tiwari, A., V. Tabatabaee, and J. K. Hollingsworth, "Tuning parallel applications in parallel", Parallel Comput., vol. 35, no. 8-9, pp. 475–492, August, 2009.
Bronevetsky, G., I. Laguna, S. Bagchi, B. R. de Supinski, D. H. Ahn, and M. Schulz, "AutomaDeD: Automata-Based Debugging for Dissimilar Parallel Tasks", 2010 IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), Chicago, IL, pp. 231-240, 2010.
Bosilca, G., C. Coti, T. Herault, P. P. Lemarinier, and J. Dongarra, "Constructing Resilient Communication Infrastructure for Runtime Environments", Parallel Computing: From Multicores and GPU's to Petascale: IOS Press, pp. 441-451, 2010.
Morris, A., A. Malony, S. Shende, and K. Huck, "Design and Implementation of a Hybrid Parallel Performance Measurement System", International Conference on Parallel Processing (ICPP 2010), San Diego, CA, pp. 492-501, September, 2010.
Preissl, R., B. R. de Supinski, M. Schulz, D. J. Quinlan, D. Kranzlmuller, and T. Panas, "Exploitation of dynamic communication patterns through static analysis", 39th International Conference on Parallel Processing (ICPP 2010), San Diego, CA, pp. 51-60, September, 2010.
Olschanowsky, C., L. Carrington, M. Tikir, M. Laurenzano, T. Rosing, and A. Snavely, "Fine-grained Energy Consumption Characterization and Modeling", DoD HPCMP UGC2010, 2010.