Kavli Affiliate: Philip Marshall
| First 5 Authors: Alex Brooks, Philip Marshall, David Ozog, Md. Wasi-ur-Rahman, Lawrence Stewart
| Summary:
Modern high-end systems are increasingly heterogeneous, giving users the
option to use general-purpose Graphics Processing Units (GPUs) and other
accelerators for additional performance. High Performance Computing (HPC) and
Artificial Intelligence (AI) applications are often carefully arranged to
overlap communications and computation for increased efficiency on such
platforms. This has led to efforts to extend popular communication libraries to
support GPU awareness and, more recently, GPU-initiated operations. In this
paper, we present Intel SHMEM, a library that enables users to write programs
that are GPU-aware, in that API calls support GPU memory, and that also support
GPU-initiated communication operations by embedding OpenSHMEM-style calls
within GPU kernels. We also propose thread-collaborative extensions to the
OpenSHMEM standard that can enable users to better exploit the strengths of
GPUs. Our implementation adaptively chooses between direct load/store from the
GPU and GPU copy-engine-based transfers to optimize performance on different
configurations.
| Search Query: ArXiv Query: search_query=au:"Philip Marshall"&id_list=&start=0&max_results=3