Support unfixed kv heads number #1416

@mangguo321 mangguo321 commented Dec 20, 2024

Fix the decilm-7b-instruct benchmark test failure. The number of KV heads per layer is not fixed in the decilm-7b-instruct model, and the current code cannot handle this case. JIRA ticket: CVS-157864.
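
To illustrate the problem, here is a minimal sketch of sizing a KV cache when each layer has its own head count. It is not the code from this PR: the function names, the `(num_blocks, num_kv_heads, block_size, head_size)` layout, and the head counts are all hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

Shape = Tuple[int, int, int, int]


def kv_cache_shapes(
    num_kv_heads_per_layer: List[int],
    num_blocks: int,
    block_size: int,
    head_size: int,
) -> List[Shape]:
    """Return one cache shape per decoder layer.

    Assumed (hypothetical) layout: (num_blocks, num_kv_heads, block_size, head_size).
    The head count is read per layer instead of from a single global value.
    """
    return [
        (num_blocks, num_kv_heads, block_size, head_size)
        for num_kv_heads in num_kv_heads_per_layer
    ]


def cache_bytes(shapes: List[Shape], bytes_per_element: int = 2) -> int:
    """Total bytes for key and value caches across all layers."""
    total = 0
    for shape in shapes:
        elements = 1
        for dim in shape:
            elements *= dim
        total += 2 * elements * bytes_per_element  # one key and one value tensor
    return total


# Made-up head counts for a four-layer model; real models differ.
heads = [4, 4, 2, 1]
shapes = kv_cache_shapes(heads, num_blocks=8, block_size=16, head_size=128)
total_bytes = cache_bytes(shapes)
```

Code that assumes a single head count for every layer would allocate the wrong shape for the last two layers in this example.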

@ilya-lavrenov

Please fix the failing tests for speculative decoding and prompt lookup; they fail because of the changes to the continuous batching implementation.

Also, please update the description to explain what is specific to the decilm model that makes these changes necessary.

@ilya-lavrenov ilya-lavrenov added this to the 2025.0 milestone Dec 24, 2024
@mangguo321 mangguo321 force-pushed the mang/fix_decilm-7b-instruct_failure branch from d7dd723 to db866ee Compare December 26, 2024 14:11
@ilya-lavrenov ilya-lavrenov added this pull request to the merge queue Dec 27, 2024
Merged via the queue into openvinotoolkit:master with commit 842c99e Dec 27, 2024
59 checks passed