15 articles in total
2023
Paper Notes: VidChapters-7M Video Chapters at Scale
Video Captioning
Paper Notes: Vid2Seq Large-Scale Pretraining of a Visual Language Model for Dense Video Captioning
Paper Notes: Human-centric Behavior Description in Videos New Benchmark and Model
Paper Notes: UCF-Crime Annotation A Benchmark for Surveillance Video-and-Language Understanding
Zero-shot Captioning Methods Based on Gradient Descent
Paper Notes: Zero-Shot Scene Graph Relation Prediction through Commonsense Knowledge Integration
2022
End-to-end Generative Pretraining for Multimodal Video Captioning
Paper Notes: GIT A Generative Image-to-text Transformer for Vision and Language
Paper Notes: SwinBERT End-to-End Transformers with Sparse Attention for Video Captioning
Paper Notes: Open-book Video Captioning with Retrieve-Copy-Generate Network
Paper Notes: