Abstract:
Graphs with billions of edges do not fit in the device memory of a single GPU, so exploring large graphs requires resorting to multiple GPUs. Besides the techniques needed to improve load balancing among threads, it is necessary to reduce the communication overhead among GPUs. To that end, we resort to a pruning procedure that eliminates redundant data and to a new interconnection technology, called APEnet, which is the first non-NVIDIA device to exploit the possibilities offered by the GPUDirect technology. Our results show that APEnet performs better than InfiniBand and may become a viable alternative for the connectivity of future GPU clusters.