GTC ON-DEMAND

 

Computer Vision
Using Multimodal Learning for TV Show Summarization
Abstract:
We'll explore new techniques for TV show summarization using multimodal deep learning for saliency detection and fusion. The goal of TV show summarization is to produce a compact visual summary that is informative and enjoyable enough to attract an audience. We propose a multimodal summarization platform that integrates the saliency cues learned from video, audio, and text. Our work focuses on three aspects: 1) saliency extraction for video, audio, and text using deep learning networks; 2) design of a fusion framework for integrating multimodal information; and 3) development of tools to speed up video processing. Using AI Vision, a public cloud-based AI service, we summarize an 11-hour TV show in one minute.
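The abstract does not say how the per-modality saliency scores are fused, so the following is only a minimal sketch of one common option: weighted late fusion followed by greedy segment selection under a duration budget. The function names, the weights, and the per-segment scores are all hypothetical and are not taken from the talk; in practice the scores would come from the video, audio, and text networks the abstract mentions.

```python
from typing import Dict, List, Sequence


def min_max_normalize(scores: Sequence[float]) -> List[float]:
    """Rescale scores to [0, 1]; a constant sequence maps to all zeros."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def fuse_saliency(per_modality: Dict[str, Sequence[float]],
                  weights: Dict[str, float]) -> List[float]:
    """Weighted late fusion (an assumed scheme) of per-segment saliency scores."""
    normalized = {name: min_max_normalize(s) for name, s in per_modality.items()}
    n_segments = len(next(iter(normalized.values())))
    if any(len(s) != n_segments for s in normalized.values()):
        raise ValueError("all modalities must score the same segments")
    total_weight = sum(weights[name] for name in normalized)
    return [
        sum(weights[name] * normalized[name][i] for name in normalized) / total_weight
        for i in range(n_segments)
    ]


def select_segments(fused: Sequence[float], durations: Sequence[float],
                    budget_seconds: float) -> List[int]:
    """Greedily keep the most salient segments that fit the budget.

    Returns the chosen segment indices in chronological order.
    """
    order = sorted(range(len(fused)), key=lambda i: fused[i], reverse=True)
    chosen: List[int] = []
    used = 0.0
    for i in order:
        if used + durations[i] <= budget_seconds:
            chosen.append(i)
            used += durations[i]
    return sorted(chosen)


# Hypothetical saliency scores for four segments of a show.
scores = {
    "video": [0.2, 0.9, 0.4, 0.7],
    "audio": [0.1, 0.8, 0.3, 0.9],
    "text": [0.5, 0.6, 0.1, 0.9],
}
weights = {"video": 0.5, "audio": 0.3, "text": 0.2}  # assumed, not from the talk
fused = fuse_saliency(scores, weights)
summary = select_segments(fused, durations=[20, 30, 25, 30], budget_seconds=60)
print(summary)
```

Normalizing each modality before fusion keeps one modality with a larger score range from dominating the sum; a learned fusion network would replace the fixed weights in a more faithful implementation.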
 
Topics:
Computer Vision, Intelligent Video Analytics, Video & Image Processing
Type:
Talk
Event:
GTC Silicon Valley
Year:
2018
Session ID:
S8221