VEGGIE: Instructional Editing and Reasoning Video Concepts with Grounded Generation

One-sentence takeaway

This work places instruction-based video editing and video concept reasoning in a single framework, filling in a key line of work at the intersection of editing and understanding.

Problem definition

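The problem it addresses: a video editing system should not only be able to modify video, but also understand and manipulate video concepts in a more grounded way. For this knowledge base, it is a good fit for fleshing out the video-understanding page.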

Method overview

VEGGIE combines instructional editing, reasoning about video concepts, and grounded generation, so that the video editing process and concept-level reasoning are more tightly coupled.

Key findings

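It shows that the interface between video-editing and video-understanding is becoming more direct: editing systems are starting to absorb reasoning requirements explicitly.
For the question "does the model actually understand the editing task?", it provides a sample that bears more directly on the underlying capability.
It also pulls vision-language and grounded generation more firmly into the video editing mainline.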

Limitations and open questions

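Whether concept reasoning actually improves the quality of the final edit still needs support from more fine-grained evaluation.
Grounded generation approaches tend to make the overall system more complex.
It is a strong directional work, but more follow-up work is needed before any stable gains become clear.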

Original links

https://openaccess.thecvf.com/content/ICCV2025/html/Yu_VEGGIE_Instructional_Editing_and_Reasoning_Video_Concepts_with_Grounded_Generation_ICCV_2025_paper.html
https://openaccess.thecvf.com/content/ICCV2025/papers/Yu_VEGGIE_Instructional_Editing_and_Reasoning_Video_Concepts_with_Grounded_Generation_ICCV_2025_paper.pdf

Notes

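Within this knowledge base, VEGGIE's role is to turn the reasoning line at the boundary between video editing and video understanding into an explicit node.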

Metadata

{ "id": "2026-04-14-veggie", "type": "source", "title": "VEGGIE (ICCV 2025): Instructional Editing and Reasoning Video Concepts with Grounded Generation", "status": "reviewed", "created": "2026-04-14", "updated": "2026-04-15", "venue": "ICCV 2025", "ingested_at": "2026-04-14", "tags": [ "near-cvpr-2025", "video-editing", "video-understanding", "vision-language", "reasoning", "primary-source" ], "note_status": "reviewed", "source_type": "paper", "authors": [ "Shoubin Yu", "Difan Liu", "Ziqiao Ma", "Yicong Hong", "Yang Zhou", "Hao Tan", "Joyce Chai", "Mohit Bansal" ], "published_at": "2025-01-01", "canonical_links": [ "https://openaccess.thecvf.com/content/ICCV2025/html/Yu_VEGGIE_Instructional_Editing_and_Reasoning_Video_Concepts_with_Grounded_Generation_ICCV_2025_paper.html", "https://openaccess.thecvf.com/content/ICCV2025/papers/Yu_VEGGIE_Instructional_Editing_and_Reasoning_Video_Concepts_with_Grounded_Generation_ICCV_2025_paper.pdf" ], "raw_entry": "raw/ingest/2026-04-14-veggie/", "topics": [ "topics/video-editing", "topics/video-understanding", "topics/vision-language" ], "entities": [ "entities/video-editing-understanding" ], "claims": [], "questions": [ "questions/question-do-benchmarks-track-real-video-editing-understanding" ] }
