Kavli Affiliate: Jing Wang | First 5 Authors: Yutao Jin, Bin Liu, Jing Wang | Summary: Video captioning models aim to describe the content of videos in accurate natural language. Because objects in a video interact in complex ways, a comprehensive understanding of the spatio-temporal relations between objects remains […]
Continue reading: Video Captioning with Aggregated Features Based on Dual Graphs and Gated Fusion