Release of data / checkpoint / demo #3

ZhangGongjie · 2024-10-15T09:26:24Z

Hi, awesome job!

I am looking forward to the release of data/checkpoint/demo!

xjj1999 · 2024-10-15T13:42:01Z

I wish I could use this masterpiece sooner too.

ZCMax · 2024-10-17T03:47:14Z

Thank you for your attention. We will release the checkpoint and demo by the end of this week but, due to certain constraints, we are unable to release the training data at this time. We'll keep updating the repo, so please stay tuned for further updates. @ZhangGongjie @xjj1999

ZCMax · 2024-10-18T18:58:19Z

We've released our checkpoint on Hugging Face along with the demo script, so feel free to try it! We'll also continue to update the evaluation scripts for various benchmarks over the next week.

xjj1999 · 2024-10-19T17:45:31Z

Hi,
I designed questions based on the paper to test the model's performance on 3D Visual Grounding, but I can't get the desired answer. How should I design the query?

python ./llava/eval/run_llava_3d.py --model-path ./LLaVA-3D-7B --video-path ./demo/scannet/scene0356_00 --query "A rectangular brown door. It is next to a bed. Which object best matches the given description? Please provide its coordinates."

The output is ‘Bed.’

ZCMax · 2024-10-20T03:53:49Z

Hello, the code does not yet officially support the 3D Visual Grounding task, which requires an extra grounding module to produce accurate grounding results. We tried simply outputting the object's 3D bounding box in text or location-token format for the 3D VG cases, and found that it does not work well. We'll continue to update the VG code next week.


xjj1999 · 2024-10-22T13:55:32Z

Thank you for your reply! Again, thank you for such awesome work. As far as I know, LLaVA-3D is the first work that directly predicts bounding boxes and still achieves such performance. I'm looking forward to testing LLaVA-3D on the 3D Grounding task this week!
