CSV Agent seems not to be compatible with utf8 #5185

Open
flefevre opened this issue Dec 10, 2024 · 3 comments
Labels
bug Something isn't working

Comments

@flefevre

Bug Description

When uploading a basic CSV file in a flow with a CSV Agent, the execution raises an exception caused by UTF-8 decoding:

Error building Component CSVAgent: 'utf-8' codec can't decode byte 0xe9 in position 7: invalid continuation byte

```
Error building Component CSVAgent:

'utf-8' codec can't decode byte 0xe9 in position 7: invalid continuation byte

Traceback (most recent call last):
  File "/app/.venv/lib/python3.12/site-packages/langflow/graph/vertex/base.py", line 709, in _build_results
    result = await initialize.loading.get_instance_results(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.12/site-packages/langflow/interface/initialize/loading.py", line 68, in get_instance_results
    return await build_component(params=custom_params, custom_component=custom_component)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.12/site-packages/langflow/interface/initialize/loading.py", line 145, in build_component
    build_results, artifacts = await custom_component.build_results()
                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.12/site-packages/langflow/custom/custom_component/component.py", line 837, in build_results
    return await self._build_with_tracing()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.12/site-packages/langflow/custom/custom_component/component.py", line 819, in _build_with_tracing
    _results, _artifacts = await self._build_results()
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.12/site-packages/langflow/custom/custom_component/component.py", line 885, in _build_results
    result = await asyncio.to_thread(method)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/asyncio/threads.py", line 25, in to_thread
    return await loop.run_in_executor(None, func_call)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "", line 65, in build_agent_response
  File "/app/.venv/lib/python3.12/site-packages/langchain_experimental/agents/agent_toolkits/csv/base.py", line 57, in create_csv_agent
    df = pd.read_csv(path, **_kwargs)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.12/site-packages/pandas/io/parsers/readers.py", line 1026, in read_csv
    return _read(filepath_or_buffer, kwds)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.12/site-packages/pandas/io/parsers/readers.py", line 620, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.12/site-packages/pandas/io/parsers/readers.py", line 1620, in __init__
    self._engine = self._make_engine(f, self.engine)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.12/site-packages/pandas/io/parsers/readers.py", line 1898, in _make_engine
    return mapping[engine](f, **self.options)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.12/site-packages/pandas/io/parsers/c_parser_wrapper.py", line 93, in __init__
    self._reader = parsers.TextReader(src, **kwds)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "parsers.pyx", line 574, in pandas._libs.parsers.TextReader.__cinit__
  File "parsers.pyx", line 663, in pandas._libs.parsers.TextReader._get_header
  File "parsers.pyx", line 874, in pandas._libs.parsers.TextReader._tokenize_rows
  File "parsers.pyx", line 891, in pandas._libs.parsers.TextReader._check_tokenize_status
  File "parsers.pyx", line 2053, in pandas._libs.parsers.raise_parser_error
  File "", line 322, in decode
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe9 in position 7: invalid continuation byte
```
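
For readers unfamiliar with this error, here is a minimal sketch of how a single byte 0xe9 can trigger it. The header text `Identité;Nom` is a made-up example, not taken from the reporter's file; it only assumes the file was saved in a single-byte encoding such as Latin-1 or Windows-1252, where "é" is stored as 0xe9.

```python
# Hypothetical header row (an assumption, not the reporter's data).
# In Latin-1 / Windows-1252, "é" is the single byte 0xE9.
raw = "Identité;Nom\n".encode("latin-1")

# UTF-8 treats 0xE9 as the start of a multi-byte sequence, so the
# following ";" is rejected as an invalid continuation byte.
try:
    raw.decode("utf-8")
    error = None
except UnicodeDecodeError as exc:
    error = exc

print(error)

# Decoding with the encoding the file was actually written in succeeds.
text = raw.decode("latin-1")
print(text)
```

The failing position is the index of the 0xe9 byte, which is why a non-ASCII character in the header row is a likely suspect here.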

Reproduction

Create a basic flow
Add a CSV Agent
Add a language model such as Mistral Small
Add a CSV file exported from Excel

Expected behavior

A basic CSV Agent should be able to handle a basic CSV file.

Who can help?

No response

Operating System

Ubuntu 22.04

Langflow Version

1.1.1

Python Version

None

Screenshot

No response

Flow File

CSVAgent.json

@flefevre flefevre added the bug Something isn't working label Dec 10, 2024

dosubot bot commented Dec 10, 2024

The error you're encountering is due to the CSV file not being encoded in UTF-8, which is the default encoding used by the CSVAgent in Langflow. To resolve this issue, you can specify the correct encoding of your CSV file using the pandas_kwargs parameter when initializing the CSVAgent. This allows you to pass additional keyword arguments to the pandas.read_csv function, including the encoding argument.

Here's how you can specify a different encoding:

CSVAgent.from_toolkit_and_llm(
    path="your_file.csv",
    llm=your_llm_instance,
    pandas_kwargs={"encoding": "latin1"}  # or another appropriate encoding
)

Replace "your_file.csv" with the path to your CSV file, your_llm_instance with your language model instance, and "latin1" with the correct encoding for your file. Common encodings include "utf-8", "latin1" (also known as "iso-8859-1"), and "cp1252" [1].
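
As a rough sketch of how the `encoding` value could be chosen, the helper below tries a short list of candidate encodings and returns the first one that decodes the whole file. The function name `first_working_encoding` and the candidate list are assumptions made for illustration. The commented call at the end refers to `create_csv_agent` from `langchain_experimental`, the function that appears in the traceback; its exact arguments may differ between versions, so treat that part as unverified.

```python
import os
import tempfile
from pathlib import Path


def first_working_encoding(path, candidates=("utf-8", "cp1252", "latin-1")):
    """Return the first candidate encoding that decodes the file without error.

    This is a heuristic: a successful decode does not prove the encoding
    is the right one, only that it is not obviously wrong.
    """
    data = Path(path).read_bytes()
    for encoding in candidates:
        try:
            data.decode(encoding)
        except UnicodeDecodeError:
            continue
        return encoding
    raise ValueError(f"none of {candidates} can decode {path}")


# Small demonstration with a hypothetical non-UTF-8 file.
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "sample.csv")
    Path(sample).write_bytes(b"Identit\xe9;Nom\n")
    pandas_kwargs = {"encoding": first_working_encoding(sample)}
    print(pandas_kwargs)

# Assumed usage (not run here; requires langchain_experimental and an LLM):
# from langchain_experimental.agents.agent_toolkits import create_csv_agent
# agent = create_csv_agent(llm, "your_file.csv", pandas_kwargs=pandas_kwargs)
```

Trying UTF-8 first is deliberate: valid UTF-8 rarely happens by accident, whereas `latin-1` accepts any byte sequence and therefore only makes sense as a last resort.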


@flefevre
Author

The pandas_kwargs parameter is not accessible from the user interface.
It would be useful if it were.
Dear Langflow team, could you expose pandas_kwargs as a parameter of the CSV Agent?
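
Until such a field exists in the interface, one possible workaround is to convert the file to UTF-8 before uploading it. The sketch below assumes the Excel export is in Windows-1252 (`cp1252`); that is a guess about the source file, and `reencode_to_utf8` is a hypothetical helper name, not part of Langflow.

```python
import os
import tempfile
from pathlib import Path


def reencode_to_utf8(src, dst, source_encoding="cp1252"):
    """Write a UTF-8 copy of src to dst, assuming src uses source_encoding."""
    # Work on raw bytes so that line endings are preserved exactly.
    text = Path(src).read_bytes().decode(source_encoding)
    Path(dst).write_bytes(text.encode("utf-8"))
    return dst


# Demonstration with a hypothetical file; replace the paths with real ones.
with tempfile.TemporaryDirectory() as tmp:
    original = os.path.join(tmp, "export.csv")
    converted = os.path.join(tmp, "export_utf8.csv")
    Path(original).write_bytes(b"Identit\xe9;Nom\r\n")
    reencode_to_utf8(original, converted)
    converted_text = Path(converted).read_bytes().decode("utf-8")
    print(converted_text)
```

The converted file can then be read with the default UTF-8 decoding, so no extra `pandas_kwargs` should be needed.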

@Cristhianzl
Member

Hi @flefevre, how are you?

Thanks for your suggestion.
You can follow this improvement in PR #5372.

This field will be available in the Controls panel of the component.
