some questions about nuscenes multi-task support #17

Liaoqing-up opened this issue Dec 14, 2022 · 2 comments

@Liaoqing-up
Thanks for releasing the nuScenes dataset code support. I have some questions about the multi-task implementation. I see in the code that you define obj_num=500 for each task and then add the task_id to the positional embedding to identify each task in the RPN transformer. Unfortunately, this increases the computation, and my machine throws a CUDA out-of-memory error. My intuitive idea for the multi-task implementation is that each task has its own head when generating the heatmap; all heatmaps are then concatenated into one tensor, the top 500 center queries are selected from it and sent to the RPN transformer, and the positional feature is just the regular x and y coordinates. In the final output, each task applies its own detection head to the transformer output features, which avoids the extra computation in the transformer layers. This was my first thought, and I wonder whether you have experimented with this approach. Are there any drawbacks? Could you share your results or conclusions? It is very important to me. Thank you ~
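For concreteness, here is a minimal PyTorch sketch of the alternative described above (per-task heatmap heads, one global top-K over the merged heatmap, plain x/y positional features). The class, argument names, and shapes are illustrative assumptions, not code from the CenterFormer repository:

```python
import torch
import torch.nn as nn

class MergedTopKProposal(nn.Module):
    """Hypothetical sketch: per-task heatmap heads, a single global top-K
    selection over the merged heatmap, and plain (x, y) positional features
    for the transformer queries."""

    def __init__(self, in_channels, task_num_classes, k=500):
        super().__init__()
        self.k = k
        # one heatmap head per task (class group)
        self.heatmap_heads = nn.ModuleList(
            [nn.Conv2d(in_channels, n_cls, kernel_size=3, padding=1)
             for n_cls in task_num_classes]
        )

    def forward(self, bev_feat):                              # (B, C, H, W)
        B, _, H, W = bev_feat.shape
        # concatenate all task heatmaps along the class dimension
        heatmaps = torch.cat(
            [head(bev_feat).sigmoid() for head in self.heatmap_heads], dim=1
        )                                                     # (B, sum(n_cls), H, W)
        # per-location maximum over classes, then one global top-K
        merged, _ = heatmaps.max(dim=1)                       # (B, H, W)
        scores, idx = merged.view(B, -1).topk(self.k, dim=1)  # (B, K)
        ys = torch.div(idx, W, rounding_mode="floor")
        xs = idx % W
        # plain x/y coordinates as the positional feature (no task id)
        pos = torch.stack([xs, ys], dim=-1).float()           # (B, K, 2)
        # gather center features to use as transformer queries
        flat = bev_feat.flatten(2)                            # (B, C, H*W)
        centers = flat.gather(2, idx.unsqueeze(1).expand(-1, flat.size(1), -1))
        return centers.transpose(1, 2), pos, scores           # (B, K, C), ...
```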

@Liaoqing-up (Author)

By the way, have you experimented with temporal (time-sequence) fusion through the RPN transformer on the nuScenes dataset? How well does it work?

@edwardzhou130 (Collaborator)

> (quoting the original question above)

Hi, sorry for the late reply. I agree with you that the current method is a bit cumbersome; some tasks may not need that many center candidates. But there are some issues if you select the top K centers from a merged heatmap:

  1. It is hard to merge the scores or select a suitable threshold for the center candidates, since some tasks may have systematically lower heatmap scores than others.
  2. Different tasks may share the same high-response region. I found it gives better results when each task is handled separately.

I also found the increase in computation cost to be relatively small, since the transformer part of CenterFormer is already lightweight. Hence, I chose to implement it this way. If you still run into memory issues, consider reducing the batch size or obj_num.
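For comparison, a rough sketch of the per-task scheme described in this reply (every task keeps its own obj_num centers, and the task id is carried along so it can be folded into the positional embedding). The function name and tensor shapes are assumptions for illustration, not the actual CenterFormer implementation:

```python
import torch

def per_task_queries(task_heatmaps, obj_num=500):
    """Illustrative sketch: each task keeps its own top-`obj_num` centers,
    and the task id is appended so the positional embedding can
    distinguish queries from different tasks."""
    all_pos = []
    for task_id, hm in enumerate(task_heatmaps):     # hm: (B, n_cls, H, W)
        B, _, H, W = hm.shape
        merged, _ = hm.max(dim=1)                    # (B, H, W)
        scores, idx = merged.view(B, -1).topk(obj_num, dim=1)
        ys = torch.div(idx, W, rounding_mode="floor")
        xs = idx % W
        tid = torch.full_like(xs, task_id)
        # (x, y, task_id): the extra channel is what makes the positional
        # embedding task-aware
        all_pos.append(torch.stack([xs, ys, tid], dim=-1))
    return torch.cat(all_pos, dim=1)                 # (B, num_tasks*obj_num, 3)
```

Because every task contributes obj_num queries, the transformer sees num_tasks * obj_num tokens, which is where the memory growth mentioned above comes from; shrinking obj_num shrinks that product directly.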
