GTC ON-DEMAND

 

Accelerated Data Science
RAPIDS CUDA DataFrame Internals for C++ Developers
Abstract:

The core of RAPIDS is CUDA DataFrame (cuDF), a library that provides Pandas-like DataFrame (a columnar data structure) functionality with GPU acceleration. cuDF provides a Python interface for use in existing data science workflows, and underneath cuDF is libcuDF, an open-source CUDA C++ library that provides a column data structure and algorithms to operate on these columns, such as filtering, selection, sorting, joining, and groupby. In this talk you will learn about some of the C++ and CUDA internals of libcuDF. This talk will cover how we perform run-time type dispatch on type-erased data structures to enable operating on a variety of data types and interface with dynamic languages like Python. We'll describe how and why we built a pool allocator for CUDA device memory to massively improve performance on multi-GPU systems. And we'll dive into GPU algorithms we use for multi-column database operations like groupby and join. If you are interested in using GPU DataFrames via libcuDF's C/C++ interface, or if you are interested in contributing to the cuDF / libcuDF open source project, then this talk is for you.

 
Topics:
Accelerated Data Science
Type:
Talk
Event:
GTC Silicon Valley
Year:
2019
Session ID:
S91043