Title

Co-attentional transformers for story-based video ...

Description
To fuse vision and language in a meaningful way, we adopt a co-attentional transformer inspired by recent work in visual dialog [11].
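
As a rough illustration of what co-attentional fusion means, the sketch below lets each modality query the other with plain scaled dot-product attention. It is a minimal toy under stated assumptions, not the model from the paper: the function names (`cross_attention`, `co_attention`), the feature sizes, and the toy inputs are all hypothetical, and learned projections, multiple heads, residual connections, and layer normalization are left out.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention: each query row attends over all key rows."""
    dim = len(keys[0])
    attended = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(feature size).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim) for k in keys
        ]
        weights = softmax(scores)
        # Weighted average of the value rows.
        attended.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return attended


def co_attention(vision, language):
    """One co-attention step (assumed form): each stream queries the other."""
    vision_out = cross_attention(vision, language, language)
    language_out = cross_attention(language, vision, vision)
    return vision_out, language_out


# Hypothetical toy inputs: 3 frame features and 2 word features, 4 dims each.
vision = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
language = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]]
vision_out, language_out = co_attention(vision, language)
```

Here `vision_out` keeps one row per frame, but each row is now a mixture of word features, and `language_out` is the mirror image. In a full model these outputs would typically pass through further learned layers before any prediction is made.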