Issues: modelscope/data-juicer
Is it possible to change the transformers version in the dependencies? I suspect the error below is a dependency issue
environment
related to third-party dependency, DJ-pypi, DJ-docker, etc.
question
Further information is requested
#524
opened Dec 26, 2024 by
baiyi-os
3 tasks done
Simplifying Open Source Contributions Through Operator Tiering from Dev aspect
dj:op
issues/PRs about some specific OPs
enhancement
New feature or request
good first issue
Good for newcomers
#510
opened Dec 11, 2024 by
yxdyc
2 tasks done
How to use Data-Juicer to process Chinese documents
question
Further information is requested
#509
opened Dec 11, 2024 by
aruig666
3 tasks done
Can the cleaning statistics be viewed after creating the config file and performing the cleaning?
question
Further information is requested
#499
opened Nov 27, 2024 by
Tendo33
3 tasks done
Guidance on Monitoring Task Execution with Ray Executor in Data Juicer
dj:dist
issues/PRs about distributed data processing
question
Further information is requested
#496
opened Nov 24, 2024 by
Fatima-0SA
3 tasks done
Merge local and API LLM calling
enhancement
New feature or request
#490
opened Nov 15, 2024 by
BeachWang
2 tasks done
Anyone tried DJ on multimodal datasets of more than 20M samples?
question
Further information is requested
#482
opened Nov 11, 2024 by
serser
3 tasks done
Windows system support
question
Further information is requested
#477
opened Nov 6, 2024 by
zytcharming
3 tasks done
Update of Jupyter Notebooks
bug
Something isn't working
documentation
Improvements or additions to documentation
#476
opened Nov 6, 2024 by
HYLcool
[Bug]: perplexity_filter operator runs out of memory (OOM)
bug
Something isn't working
#474
opened Nov 5, 2024 by
weiaicunzai
3 tasks done
How to calculate the image_text_similarity scores for both Chinese and English?
dj:multimodal
issues/PRs about multimodal data processing
dj:op
issues/PRs about some specific OPs
question
Further information is requested
#473
opened Nov 5, 2024 by
weiaicunzai
A try_num parameter is needed when generating data with an LLM
enhancement
New feature or request
#470
opened Nov 4, 2024 by
BeachWang
2 tasks done
[Feat]: Unified LLM Calling Management
enhancement
New feature or request
#451
opened Oct 16, 2024 by
drcege
2 tasks done
[Feat]: Automatic Version Matching During Installation
enhancement
New feature or request
#450
opened Oct 16, 2024 by
drcege
2 tasks done
[Feat]: Enhance Unit Test Coverage for Python and CUDA Compatibility
enhancement
New feature or request
#449
opened Oct 16, 2024 by
drcege
2 tasks done
[Bug]: KeyError: 'resource'
bug
Something isn't working
#440
opened Sep 29, 2024 by
luckystar1992
3 tasks done
Require fps filter and mapper for videos
dj:op
issues/PRs about some specific OPs
enhancement
New feature or request
#433
opened Sep 23, 2024 by
BeachWang
[Feat] Support explicit FusedOP that allows for the configuration and application of multiple operators in smaller, manageable batches
dj:op
issues/PRs about some specific OPs
enhancement
New feature or request
#413
opened Sep 2, 2024 by
yxdyc
2 tasks done
Guidance for OP with multiple data fields to be processed
enhancement
New feature or request
#411
opened Sep 2, 2024 by
yxdyc
2 tasks done
[Feat]: Add Ray actor support
dj:dist
issues/PRs about distributed data processing
enhancement
New feature or request
stale-issue
#371
opened Jul 29, 2024 by
drcege
support panda's student captioner model in our captioning mapper
dj:multimodal
issues/PRs about multimodal data processing
dj:op
issues/PRs about some specific OPs
enhancement
New feature or request
stale-issue
#251
opened Mar 14, 2024 by
yxdyc