Feat: make use of litellm's response "usage" data #2947
Comments
An attempt was made to automatically fix this issue, but it was unsuccessful. A branch named 'openhands-fix-issue-2947' has been created with the attempted changes; manual intervention may be required.
What problem or use case are you trying to solve?
TASK
We want to enhance our get_token_count() implementation in llm.py to take advantage of the token counts our dependencies already provide, when available, and only fall back to counting tokens ourselves when they are not.
Read the llm.py file. It uses the litellm library, and you will find some code there that already uses litellm's Usage object.
This `usage` data, when available, provides live token counts. It will be available for some LLMs and not for others. If these values could be stored and/or attached to the corresponding messages, they could be aggregated and returned by get_token_count() in llm.py and used to live-compute the token counts (see the sketch below).
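As a rough sketch of the consumer side (the `Message` dataclass and its `token_count` field here are assumptions made for illustration, not the repo's actual types), get_token_count() could prefer provider-reported counts and fall back to litellm's token_counter() helper only for messages that lack them:

```python
# Minimal sketch, not the repo's actual code. Assumes a hypothetical Message
# type whose token_count field is filled from response.usage when the
# provider reports it, and None otherwise.
from dataclasses import dataclass

import litellm


@dataclass
class Message:
    role: str
    content: str
    # Hypothetical field: populated from litellm's Usage object when available.
    token_count: int | None = None


def get_token_count(model: str, messages: list[Message]) -> int:
    """Total token count, preferring provider-reported counts."""
    total = 0
    uncounted: list[dict] = []
    for msg in messages:
        if msg.token_count is not None:
            total += msg.token_count
        else:
            uncounted.append({'role': msg.role, 'content': msg.content})
    if uncounted:
        # Fallback: litellm.token_counter() estimates counts locally.
        total += litellm.token_counter(model=model, messages=uncounted)
    return total
```

The point of this shape is that the fallback tokenizes only the messages missing usage data, so providers that do report usage never trigger local counting.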
Also read the codeact_agent.py file to see how the LLM is used, and message.py to understand the messages. Read event.py as well; there are multiple files with this name, and that's okay: read them all to see Events.
IMPORTANT:
- First, read and review the files you need in order to understand how we can do this.
- Then, decide on a solution and implement it.
- Unit tests are required.
/TASK
ORIGINAL MESSAGE
The `usage` data, if filled, provides live token counts. It will work for some LLMs and will not be available for others. If these values could be stored and/or attached to the messages in codeact_agent.py accordingly, they could be calculated and returned in get_token_count() in llm.py.
Could this then help a running memory condenser with its calculations?
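For the capture side, a minimal sketch of reading litellm's Usage object from a completion response (the model name is illustrative, and whether `usage` is populated depends on the provider):

```python
import litellm

# Illustrative model name; substitute whatever the agent is configured with.
response = litellm.completion(
    model='gpt-4o-mini',
    messages=[{'role': 'user', 'content': 'Hello!'}],
)

# litellm mirrors the OpenAI-style Usage object; some providers leave it unset.
usage = getattr(response, 'usage', None)
if usage is not None:
    print('prompt tokens:', usage.prompt_tokens)
    print('completion tokens:', usage.completion_tokens)
```

These per-response counts are what would be attached to the corresponding messages so get_token_count() can sum them instead of re-tokenizing.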