Yes, the current code only supports click-based 3D bounding box outputs, but we will release an update next week that adds support for purely language-guided 3D visual grounding. At the moment the code does not officially support the 3D Visual Grounding task, which requires an extra grounding head to achieve accurate grounding results. We have tried simply outputting the object's 3D bounding box in text or location-token format for the 3D VG cases, and found that it does not work well.
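For reference, here is a minimal sketch of what such an extra grounding head could look like: a small MLP that scores candidate objects against the referring expression and regresses a 3D box, rather than emitting the box as text tokens. All names and dimensions below (`GroundingHead`, `feat_dim`, the 6-value box parameterization) are illustrative assumptions, not part of this repository's code.

```python
# Illustrative sketch only -- not this repository's implementation.
# Assumes fused per-candidate features (language + 3D object features) of size `feat_dim`.
import torch
import torch.nn as nn

class GroundingHead(nn.Module):
    """Hypothetical grounding head: scores each candidate object against the
    referring expression and regresses a 3D box (center x, y, z + size w, h, d)."""
    def __init__(self, feat_dim: int = 256):
        super().__init__()
        self.score = nn.Sequential(
            nn.Linear(feat_dim, feat_dim), nn.ReLU(),
            nn.Linear(feat_dim, 1),          # confidence per candidate object
        )
        self.box = nn.Sequential(
            nn.Linear(feat_dim, feat_dim), nn.ReLU(),
            nn.Linear(feat_dim, 6),          # (cx, cy, cz, w, h, d)
        )

    def forward(self, fused_feats: torch.Tensor):
        # fused_feats: (batch, num_candidates, feat_dim)
        logits = self.score(fused_feats).squeeze(-1)   # (batch, num_candidates)
        boxes = self.box(fused_feats)                  # (batch, num_candidates, 6)
        return logits, boxes

# Typical training signal for this kind of head: cross-entropy over candidates
# for the referred object plus an L1/IoU loss on its box; ScanRefer evaluation
# then reports Acc@0.25 and Acc@0.5 IoU.
```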
Hello. Could you please advise me on how to properly train a model for 3D VG on ScanRefer: model, losses, dataset, metrics?
If I understood correctly, your current model can predict bounding boxes only as text, and only with an additional click on the object.