FlyAI小助手

  • 3

    获得赞
  • 85873

    发布的文章
  • 0

    答辩的项目

An Open and Comprehensive Pipeline for Unified Object Grounding and Detection

作者: Xiangyu Zhao

作者邀请

论文作者还没有讲解视频

邀请直播讲解

您已邀请成功, 目前已有 $vue{users_count} 人邀请!

再次邀请

Grounding-DINO is a state-of-the-art open-set detection model that tackles multiple vision tasks including Open-Vocabulary Detection (OVD), Phrase Grounding (PG), and Referring Expression Comprehension (REC). Its effectiveness has led to its widespread adoption as a mainstream architecture for various downstream applications. However, despite its significance, the original Grounding-DINO model lacks comprehensive public technical details due to the unavailability of its training code. To bridge this gap, we present MM-Grounding-DINO, an open-source, comprehensive, and user-friendly baseline, which is built with the MMDetection toolbox. It adopts abundant vision datasets for pre-training and various detection and grounding datasets for fine-tuning. We give a comprehensive analysis of each reported result and detailed settings for reproduction. The extensive experiments on the benchmarks mentioned demonstrate that our MM-Grounding-DINO-Tiny outperforms the Grounding-DINO-Tiny baseline. We release all our models to the research community. Codes and trained models are released at https://github.com/open-mmlab/mmdetection/tree/main/configs/mm_grounding_dino.

文件下载
本作品采用 知识共享署名-非商业性使用-相同方式共享 4.0 国际许可协议进行许可,转载请附上原文出处链接和本声明。
本文链接地址:https://www.flyai.com/paper_detail/86896
讨论
500字
表情
发送
删除确认
是否删除该条评论?
取消 删除