2024-12-19 10:32:00,341 - datasets - INFO - PyTorch version 2.5.1 available.
2024-12-19 10:32:00,341 - datasets - INFO - Polars version 1.17.1 available.
2024-12-19 10:32:01,264 - evalscope - INFO - Args: Task config is provided with dictionary type.
2024-12-19 10:32:01,269 - evalscope - INFO - Dump task config to ./outputs/20241219_103201/configs/task_config_190d29.yaml
2024-12-19 10:32:01,270 - evalscope - INFO - {
    "model": null,
    "model_id": null,
    "model_args": {
        "revision": "master",
        "precision": "torch.float16",
        "device": "auto"
    },
    "template_type": null,
    "chat_template": null,
    "datasets": null,
    "dataset_args": {},
    "dataset_dir": "/home/wuchen/.cache/modelscope/datasets",
    "dataset_hub": "modelscope",
    "generation_config": {
        "max_length": 2048,
        "max_new_tokens": 512,
        "do_sample": false,
        "top_k": 50,
        "top_p": 1.0,
        "temperature": 1.0
    },
    "eval_type": "checkpoint",
    "eval_backend": "RAGEval",
    "eval_config": {
        "tool": "RAGAS",
        "testset_generation": {
            "docs": [
                "/home/wuchen/gridqa/data/raw/demo/pdf/1.pdf",
                "/home/wuchen/gridqa/data/raw/demo/pdf/2.pdf",
                "/home/wuchen/gridqa/data/raw/demo/pdf/3.pdf"
            ],
            "test_size": 5,
            "output_file": "outputs/testset.json",
            "knowledge_graph": "outputs/knowledge_graph.json",
            "distribution": {
                "simple": 0.7,
                "multi_context": 0.2,
                "reasoning": 0.1
            },
            "generator_llm": {
                "api_base": "******************",
                "api_key": "********************************************"
            },
            "embeddings": {
                "model_name_or_path": "/home/wuchen/models/BAAI/bge-large-zh-v1___5"
            },
            "language": "chinese"
        }
    },
    "stage": "all",
    "limit": null,
    "mem_cache": false,
    "use_cache": null,
    "work_dir": "./outputs/20241219_103201",
    "outputs": null,
    "debug": false,
    "dry_run": false,
    "seed": 42
}
2024-12-19 10:32:01,729 - evalscope - INFO - Check `ragas` Installed
/home/wuchen/anaconda3/envs/evalscope/lib/python3.10/site-packages/evalscope/backend/rag_eval/ragas/tasks/testset_generation.py:72: LangChainDeprecationWarning: The class `UnstructuredFileLoader` was deprecated in LangChain 0.2.8 and will be removed in 1.0. An updated version of the class exists in the :class:`~langchain-unstructured package and should be used instead. To use it run `pip install -U :class:`~langchain-unstructured` and import as `from :class:`~langchain_unstructured import UnstructuredLoader``.
  loader = UnstructuredFileLoader(file_path, mode='single')
2024-12-19 10:32:15,340 - pikepdf._core - INFO - pikepdf C++ to Python logger bridge initialized
2024-12-19 10:37:05,288 - sentence_transformers.SentenceTransformer - INFO - Use pytorch device_name: cuda
2024-12-19 10:37:05,288 - sentence_transformers.SentenceTransformer - INFO - Load pretrained SentenceTransformer: /home/wuchen/models/BAAI/bge-large-zh-v1___5
Traceback (most recent call last):
  File "/home/wuchen/gridqa/tests/eval_rag_gen.py", line 29, in <module>
    run_task(task_cfg=generate_testset_task_cfg)
  File "/home/wuchen/anaconda3/envs/evalscope/lib/python3.10/site-packages/evalscope/run.py", line 36, in run_task
    return run_single_task(task_cfg, run_time)
  File "/home/wuchen/anaconda3/envs/evalscope/lib/python3.10/site-packages/evalscope/run.py", line 49, in run_single_task
    return run_non_native_backend(task_cfg)
  File "/home/wuchen/anaconda3/envs/evalscope/lib/python3.10/site-packages/evalscope/run.py", line 81, in run_non_native_backend
    backend_manager.run()
  File "/home/wuchen/anaconda3/envs/evalscope/lib/python3.10/site-packages/evalscope/backend/rag_eval/backend_manager.py", line 71, in run
    self.run_ragas(testset_args, eval_args)
  File "/home/wuchen/anaconda3/envs/evalscope/lib/python3.10/site-packages/evalscope/backend/rag_eval/backend_manager.py", line 50, in run_ragas
    generate_testset(TestsetGenerationArguments(**testset_args))
  File "/home/wuchen/anaconda3/envs/evalscope/lib/python3.10/site-packages/evalscope/backend/rag_eval/ragas/tasks/testset_generation.py", line 93, in generate_testset
    transforms = default_transforms(
  File "/home/wuchen/anaconda3/envs/evalscope/lib/python3.10/site-packages/evalscope/backend/rag_eval/ragas/tasks/build_transform.py", line 133, in default_transforms
    raise ValueError('Documents appears to be too short (ie 100 tokens or less). Please provide longer documents.')
ValueError: Documents appears to be too short (ie 100 tokens or less). Please provide longer documents.
Runtime Environment
Operating System:
Windows
macOS
Ubuntu
Python Version:
3.11
3.10
3.9
Additional Information
evalscope 0.8.1
ragas 0.2.7
langchain 0.3.13
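Issue Description
Generating questions fails with the error: Documents appears to be too short (ie 100 tokens or less)
The PDF documents are Chinese-language reference material, each about 1 MB in size and 30 to 60 pages long.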
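Since the PDFs are large, one thing worth ruling out is that text extraction returned little or no usable text for some file (for example, if the pages are scanned images); the log does not confirm this, it is only a guess. Below is a minimal sketch for checking already-extracted text against the 100-token threshold named in the error message. The helpers `rough_token_count` and `find_short_documents` are hypothetical names introduced here, and the token estimate is a crude approximation, not the tokenizer that evalscope or ragas actually uses.

```python
import re

# Hypothetical helpers for a quick sanity check. The counting rule below is an
# assumption: it is NOT the tokenizer used inside evalscope or ragas.
_CJK_CHAR = re.compile(r"[\u4e00-\u9fff]")
_ASCII_WORD = re.compile(r"[A-Za-z0-9_]+")


def rough_token_count(text: str) -> int:
    """Estimate tokens: one per CJK character plus one per ASCII word."""
    return len(_CJK_CHAR.findall(text)) + len(_ASCII_WORD.findall(text))


def find_short_documents(texts: dict[str, str], min_tokens: int = 101) -> list[str]:
    """Return names of documents whose estimated token count is below min_tokens."""
    return [
        name
        for name, text in texts.items()
        if rough_token_count(text) < min_tokens
    ]


if __name__ == "__main__":
    # Made-up sample data: one empty extraction and one with 160 CJK characters.
    samples = {
        "empty_extraction.pdf": "",
        "normal.pdf": "示例文本" * 40,
    }
    print(find_short_documents(samples))
```

To use it on the real files, pass in the text produced by whatever loader extracts the PDFs, keyed by file path; any name it returns is a candidate for the document that trips the length check.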
Code or Commands Executed
Error Log