PAPI 5.4.1 is now available from http://icl.utk.edu/papi/. This is a minor release with a major rewrite of the CUDA component as well as several other enhancements and bug fixes.
This release provides CUDA 6.5 support for multiple GPU devices and multiple CUDA contexts.
Other enhancements and bug fixes in this release:
Updated support for Intel Haswell and Haswell-EP.
Added support for ARM Cortex-A7.
Added support for the ARM 1176 CPU (original Raspberry Pi).
Enhanced PAPI preset events to allow user-defined events.
User-defined events are set up via a user event definition file.
Added a test that demonstrates attaching an eventset to a single CPU rather than to a thread.
Adopted the term “event qualifiers” in place of “event masks” for clarity.
Added pkg-config support to PAPI.
Fixed a segfault in the lustre component.
Fixed compilation in the absence of a Fortran compiler.
Fixed a bug in the krentel_pthreads ctest so that threads are joined properly on exit.
Fixed a bug in perf_events where event masks were not being cleared properly.
Fixed a memory leak in perf_events.
More details on these changes can be found in the file ChangeLogP541.txt inside the PAPI tarball.
Last but not least, a special Thank You goes out to all our collaborators and contributors!