Release of data / checkpoint / demo #3

Open
ZhangGongjie opened this issue Oct 15, 2024 · 6 comments

Comments

@ZhangGongjie

Hi, awesome job!

I am looking forward to the release of data/checkpoint/demo!

@xjj1999

xjj1999 commented Oct 15, 2024

I wish I could use this masterpiece sooner too.

@ZCMax
Owner

ZCMax commented Oct 17, 2024

Thank you for your interest. We will release the checkpoint and demo by the end of this week, but due to certain constraints we are unable to release the training data at this time. We'll keep updating the repo, so please stay tuned for further updates. @ZhangGongjie @xjj1999

@ZCMax
Owner

ZCMax commented Oct 18, 2024

We've released our checkpoint on Hugging Face along with the demo script, so feel free to try it! We'll also continue to update the evaluation scripts for various benchmarks over the next week.

@xjj1999

xjj1999 commented Oct 19, 2024

Hi,
I designed questions based on the paper to test the model's performance on 3D visual grounding, but I can't get the desired answer. How should I design the query?

python ./llava/eval/run_llava_3d.py --model-path ./LLaVA-3D-7B --video-path ./demo/scannet/scene0356_00 --query "A rectangular brown door. It is next to a bed. Which object best matches the given description? Please provide its coordinates."

The output is ‘Bed.’

@ZCMax
Owner

ZCMax commented Oct 20, 2024 via email

@xjj1999

xjj1999 commented Oct 22, 2024

Thank you for your reply! Again, thank you for such awesome work. As far as I know, LLaVA-3D is the first work that directly predicts bounding boxes yet still achieves such performance. I'm looking forward to testing LLaVA-3D on the 3D grounding task this week!


3 participants