fix: Clean input before passing it to the llm #238

paul-paliychuk · 2024-12-11T00:37:24Z

Important

Adds _clean_input method to sanitize input strings in LLMClient and OpenAIClient, with tests for validation.

Behavior:
- Adds _clean_input method in client.py to remove invalid Unicode and control characters from input strings.
- Integrates _clean_input into generate_response in client.py and openai_client.py to sanitize message content before processing.
Testing:
- Adds test_client.py to test _clean_input with various cases, including control characters, zero-width characters, and invalid Unicode.
Misc:
- Updates version in pyproject.toml to 0.5.0pre5.

^{This description was created by}^{for e852493. It will automatically update as commits are pushed.}

ellipsis-dev

👍 Looks good to me! Incremental review on 96e6377 in 8 seconds

More details

Looked at 24 lines of code in 1 files
Skipped 0 files when reviewing.
Skipped posting 0 drafted comments based on config settings.

Workflow ID: wflow_f5hQKesiGknn0f9n

You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

ellipsis-dev

❌ Changes requested. Reviewed everything up to f40ca8c in 2 minutes and 33 seconds

More details

Looked at 104 lines of code in 3 files
Skipped 0 files when reviewing.
Skipped posting 2 drafted comments based on config settings.

1. graphiti_core/llm_client/openai_client.py:91

Draft comment:
The _clean_input method is called twice for each message, once in generate_response and once in _generate_response. This is redundant. Consider removing one of the calls to optimize performance.
Reason this comment was not posted:
Confidence changes required: 80%
The _clean_input method is being called twice for each message, once in generate_response and once in _generate_response. This is redundant and can be optimized.

2. tests/llm_client/test_client.py:34

Draft comment:
Remove unnecessary comment as it repeats the information already present in the code.
Reason this comment was not posted:
Confidence changes required: 10%
The test cases in the test_clean_input function are well-structured and cover various scenarios. However, the comment on line 34 is not necessary as it repeats the information already present in the code.

Workflow ID: wflow_5X35jGYEohyhiscC

Want Ellipsis to fix these issues? Tag @ellipsis-dev in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

graphiti_core/llm_client/client.py

ellipsis-dev

👍 Looks good to me! Incremental review on d8fa84e in 14 seconds

More details

Looked at 41 lines of code in 1 files
Skipped 0 files when reviewing.
Skipped posting 2 drafted comments based on config settings.

1. graphiti_core/llm_client/client.py:59

Draft comment:
The retry logic was moved from above _clean_input to _generate_response_with_retry. Ensure this change is intentional and does not affect other parts of the code.
Reason this comment was not posted:
Confidence changes required: 50%
The _clean_input method is correctly implemented and integrated into the generate_response method. However, the retry logic was moved, which is not a problem but should be noted.

2. graphiti_core/llm_client/client.py:60

Draft comment:
The variable name input shadows the built-in Python function input(). Consider renaming it to something more descriptive, like input_text.
Reason this comment was not posted:
Confidence changes required: 80%
The code is mostly well-structured, but there are some issues with variable naming and DRY principle.

Workflow ID: wflow_rOqbjW6sty1EarQV

You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

danielchalef

lgtm

ellipsis-dev

👍 Looks good to me! Incremental review on e852493 in 10 seconds

More details

Looked at 12 lines of code in 1 files
Skipped 0 files when reviewing.
Skipped posting 2 drafted comments based on config settings.

1. pyproject.toml:3

Draft comment:
The version update to "0.5.0pre5" is appropriate for the new feature addition. Ensure that this version is consistently used across all relevant documentation and deployment scripts.
Reason this comment was not posted:
Confidence changes required: 0%
The version update in the pyproject.toml file is consistent with the changes made in the PR, which is to add a new feature for input cleaning. This is a minor version update, which is appropriate for adding new functionality in a backward-compatible manner.

2. pyproject.toml:3

Draft comment:
Ensure that the versioning follows a consistent pattern across all files and documentation.
Reason this comment was not posted:
Confidence changes required: 33%
The pyproject.toml file looks well-structured and doesn't contain any obvious issues related to the rules provided. No secrets or sensitive data are present, and the file is formatted correctly.

Workflow ID: wflow_3GJlWPkSsoDm91bR

You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

fix: Clean input before passing it to the llm

f40ca8c

paul-paliychuk requested review from danielchalef and prasmussen15 December 11, 2024 00:37

paul-paliychuk temporarily deployed to development December 11, 2024 00:37 — with GitHub Actions Inactive

chore: Add license

96e6377

paul-paliychuk temporarily deployed to development December 11, 2024 00:39 — with GitHub Actions Inactive

ellipsis-dev bot reviewed Dec 11, 2024

View reviewed changes

graphiti_core/llm_client/client.py Show resolved Hide resolved

fix: typo

d8fa84e

paul-paliychuk temporarily deployed to development December 11, 2024 00:41 — with GitHub Actions Inactive

ellipsis-dev bot reviewed Dec 11, 2024

View reviewed changes

danielchalef approved these changes Dec 11, 2024

View reviewed changes

chore: Bump graphiti version

e852493

paul-paliychuk temporarily deployed to development December 11, 2024 02:24 — with GitHub Actions Inactive

ellipsis-dev bot reviewed Dec 11, 2024

View reviewed changes

paul-paliychuk merged commit a9091b0 into main Dec 11, 2024
7 checks passed

paul-paliychuk deleted the paul/zepai-847-clean-llm-inputs-to-avoid-invalid-unicode-or-control branch December 11, 2024 02:27

github-actions bot locked and limited conversation to collaborators Dec 11, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: Clean input before passing it to the llm #238

fix: Clean input before passing it to the llm #238

paul-paliychuk commented Dec 11, 2024 •

edited by ellipsis-dev bot

Loading

ellipsis-dev bot left a comment

ellipsis-dev bot left a comment

ellipsis-dev bot left a comment

danielchalef left a comment

ellipsis-dev bot left a comment

fix: Clean input before passing it to the llm #238

fix: Clean input before passing it to the llm #238

Conversation

paul-paliychuk commented Dec 11, 2024 • edited by ellipsis-dev bot Loading

ellipsis-dev bot left a comment

Choose a reason for hiding this comment

ellipsis-dev bot left a comment

Choose a reason for hiding this comment

ellipsis-dev bot left a comment

Choose a reason for hiding this comment

danielchalef left a comment

Choose a reason for hiding this comment

ellipsis-dev bot left a comment

Choose a reason for hiding this comment

paul-paliychuk commented Dec 11, 2024 •

edited by ellipsis-dev bot

Loading