
Training model for 3D VG #7

Open
col14m opened this issue Oct 23, 2024 · 3 comments

Comments


col14m commented Oct 23, 2024

Hello. Could you please advise me on how to properly train a model for 3D VG on ScanRefer: model, losses, dataset, metrics?

Your current model can predict bounding boxes only as text and only with an additional click on the object, if I understood everything correctly.
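For context on the metrics part of the question: the standard ScanRefer evaluation metric is Acc@kIoU, the fraction of queries whose predicted 3D box overlaps the ground-truth box with IoU at or above a threshold k (typically 0.25 and 0.5). A minimal sketch for axis-aligned boxes follows; the function names are illustrative and not from this repo:

```python
import numpy as np

def box3d_iou(box_a, box_b):
    """IoU of two axis-aligned 3D boxes given as (cx, cy, cz, dx, dy, dz)."""
    a_min = np.array(box_a[:3]) - np.array(box_a[3:]) / 2.0
    a_max = np.array(box_a[:3]) + np.array(box_a[3:]) / 2.0
    b_min = np.array(box_b[:3]) - np.array(box_b[3:]) / 2.0
    b_max = np.array(box_b[:3]) + np.array(box_b[3:]) / 2.0
    # Intersection extents, clamped to zero when the boxes do not overlap.
    inter = np.clip(np.minimum(a_max, b_max) - np.maximum(a_min, b_min), 0.0, None)
    inter_vol = inter.prod()
    vol_a = float(np.prod(box_a[3:]))
    vol_b = float(np.prod(box_b[3:]))
    return inter_vol / (vol_a + vol_b - inter_vol)

def grounding_accuracy(preds, gts, thresh=0.25):
    """Acc@thresh IoU: fraction of predictions matching their ground truth."""
    hits = sum(box3d_iou(p, g) >= thresh for p, g in zip(preds, gts))
    return hits / len(gts)
```

For example, two unit-offset 2x2x2 boxes at (0,0,0) and (1,0,0) intersect in a 1x2x2 region (volume 4) against a union of 12, giving IoU 1/3, so that pair counts at the 0.25 threshold but not at 0.5.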

@col14m col14m changed the title Traning model for 3D VG Training model for 3D VG Oct 23, 2024
ZCMax (Owner) commented Oct 24, 2024

Yes, the current code only supports click-based 3D bounding box outputs, but we will release an update next week that adds support for purely language-guided 3D visual grounding. Right now the code does not officially support the 3D visual grounding task, which requires an extra grounding head to achieve accurate grounding results. We tried simply outputting the object's 3D bounding box in text or location-token format for the 3D VG cases, and found that it does not work well~


xjj1999 commented Nov 1, 2024

Hello, has the 3D visual grounding module been released yet?

ZCMax (Owner) commented Nov 6, 2024

Sorry for the late reply; we will release the 3D VG related code after the CVPR deadline. Sorry about that, and thanks for your understanding~
