A script-based autotuning compiler system to generate high-performance CUDA code