Hear about the latest developments concerning the NVIDIA GPUDirect family of technologies, which aim to improve both the data path and the control path among GPUs and between GPUs and third-party devices. We'll introduce the fundamental concepts behind GPUDirect and present the latest developments, including changes to pre-existing APIs and the recently introduced new APIs. We'll also discuss the expected performance in combination with the new computing platforms that emerged last year.
In the GPU off-loading programming model, the CPU is the initiator, i.e. it prepares and orchestrates work for the GPU. In GPU-accelerated multi-node programs, the CPU has to do the same for the network interface as well. But both the GPU and the network interface have sophisticated hardware resources, and these can be connected directly so as to take the CPU out of the loop altogether. Meet PeerSync, a set of CUDA-InfiniBand Verbs interoperability APIs that opens up a wide range of possibilities. It also provides a scheme for going beyond the GPU-network duo, i.e. applying the same ideas to other third-party devices.
APEnet+ is a novel cluster interconnect based on a custom PCI card that features a PCI Express Gen2 x8 link and a reconfigurable hardware component (FPGA). It supports a 3D torus topology and has special acceleration features specifically developed for NVIDIA Fermi GPUs. An introduction to the basic features and the programming model of APEnet+ will be followed by a description of its performance in several numerical simulations, such as High Energy Physics simulations.
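To make the 3D torus topology mentioned above concrete, here is a minimal Python sketch of how neighbours and hop distances work on a generic 3D torus. It is only an illustration of the topology: the function names `torus_neighbors` and `torus_hops` and the coordinate scheme are assumptions made for this example, and nothing here describes the actual addressing or routing used by APEnet+.

```python
from typing import List, Tuple

# Hypothetical coordinate type for this sketch: (x, y, z) position of a node.
Coord = Tuple[int, int, int]


def torus_neighbors(node: Coord, dims: Coord) -> List[Coord]:
    """Return the six nearest neighbours of `node` on a torus of size `dims`.

    Each axis wraps around, so a node on the edge of the grid is linked
    to the node on the opposite edge. If an axis has size 2, the two
    neighbours along that axis coincide and appear twice in the result.
    """
    neighbors = []
    for axis in range(3):
        for step in (-1, 1):
            coord = list(node)
            coord[axis] = (coord[axis] + step) % dims[axis]
            neighbors.append(tuple(coord))
    return neighbors


def torus_hops(a: Coord, b: Coord, dims: Coord) -> int:
    """Return the minimum number of hops between `a` and `b`.

    Along each axis the shorter of the two directions around the ring
    is taken, and the per-axis distances are summed.
    """
    return sum(min((x - y) % d, (y - x) % d) for x, y, d in zip(a, b, dims))
```

For example, on a 4x4x4 torus the corner node `(0, 0, 0)` is directly linked to `(3, 0, 0)` through the wraparound link, and `torus_hops((0, 0, 0), (3, 3, 3), (4, 4, 4))` is 3 rather than 9, which is the main appeal of the wraparound links for nearest-neighbour communication patterns.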