GTC ON-DEMAND

Abstract:
Join a special presentation from our 2018-2019 Graduate Fellowship recipients to learn what's next from the world of research and academia. Sponsored projects involve a variety of technical challenges, including topics such as 3D scene understanding, new programming models for tensor computations, HPC physics simulations for astrophysics, deep learning algorithms for AI natural language learning, and cancer diagnosis. We believe that these students will lead the future in our industry, and we're proud to support the 2018-2019 NVIDIA Graduate Fellows. For more information on the NVIDIA Graduate Fellowship program, visit www.nvidia.com/en-us/research/graduate-fellowships.
 
Topics:
AI & Deep Learning Research, Virtual Reality & Augmented Reality, Graphics and AI, Computational Biology & Chemistry, Computer Vision
Type:
Talk
Event:
GTC Silicon Valley
Year:
2019
Session ID:
S9976
 
Abstract:
We'll describe the task of free-form and open-ended Visual Question Answering (VQA). Given an image and a natural language question about the image (e.g., "What kind of store is this?", "How many people are waiting in the queue?", "Is it safe to cross the street?"), the machine's task is to automatically produce an accurate natural language answer ("bakery", "5", "Yes"). Answering any possible question about an image is one of the 'holy grails' of AI, requiring integration of vision, language, and reasoning. We have collected and recently released a dataset containing >250,000 images, >750,000 questions, and ~10 million answers (www.visualqa.org). We are also running the VQA challenge (www.visualqa.org/challenge.html), which includes both an open-ended answering task and a multiple-choice task.
 
Topics:
Computer Vision, Artificial Intelligence and Deep Learning, Big Data Analytics
Type:
Talk
Event:
GTC Silicon Valley
Year:
2016
Session ID:
S6745