GTC ON-DEMAND

 

Computer Vision
VQA: Visual Question Answering
Abstract:
We'll describe the task of free-form, open-ended Visual Question Answering (VQA). Given an image and a natural language question about the image (e.g., "What kind of store is this?", "How many people are waiting in the queue?", "Is it safe to cross the street?"), the machine's task is to automatically produce an accurate natural language answer ("bakery", "5", "Yes"). Answering any possible question about an image is one of the 'holy grails' of AI, requiring the integration of vision, language, and reasoning. We have collected and recently released a dataset containing >250,000 images, >750,000 questions, and ~10 million answers (www.visualqa.org). We are also running the VQA Challenge (www.visualqa.org/challenge.html), which includes both an open-ended answering task and a multiple-choice task.
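To make the task's inputs and outputs concrete, here is a minimal Python sketch of one record and of the two answering modes named in the abstract. The record layout, the field names (`image_path`, `human_answers`, `choices`), and both helper functions are hypothetical illustrations and not the released dataset's schema or the challenge's scoring code. The `scores` dictionary stands in for whatever a real vision-and-language model would produce.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class VQAExample:
    """One hypothetical record: an image, a question, and several human answers."""
    image_path: str            # assumed field name, not the dataset's actual schema
    question: str
    human_answers: List[str]   # free-form answers collected from people
    choices: List[str] = field(default_factory=list)  # only used for multiple-choice


def consensus_answer(example: VQAExample) -> str:
    """Open-ended reference: the most common human answer after light normalization."""
    normalized = [a.strip().lower() for a in example.human_answers]
    return Counter(normalized).most_common(1)[0][0]


def answer_multiple_choice(example: VQAExample, scores: Dict[str, float]) -> str:
    """Multiple-choice mode: pick the listed choice with the highest model score."""
    # Choices the (placeholder) model did not score are treated as least likely.
    return max(example.choices, key=lambda c: scores.get(c, float("-inf")))


example = VQAExample(
    image_path="store.jpg",
    question="What kind of store is this?",
    human_answers=["bakery", "Bakery", "cafe"],
    choices=["bakery", "bank", "5", "yes"],
)

print(consensus_answer(example))
print(answer_multiple_choice(example, {"bakery": 0.9, "bank": 0.05}))
```

In the open-ended mode a system must generate the answer string itself, while in the multiple-choice mode it only has to rank a fixed list of candidates, which is why the second helper takes per-choice scores rather than producing text.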
 
Topics: Computer Vision, Artificial Intelligence and Deep Learning, Big Data Analytics
Type: Talk
Event: GTC Silicon Valley
Year: 2016
Session ID: S6745