A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone
Attention
This is a customized repo with some adjustments to work in a Windows environment.
I've made the following changes:
- Updated requirements to work seamlessly
- Added `main.py`, which is basically the example code but with some bugfixes (including an OCR example in `assets`); a rough usage sketch follows this list
- Added `flash_attn_whl`, which contains a pre-built wheel (split with tar) for installing flash_attn, since compiling it from source takes more than 4 hours
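For orientation, here is a minimal sketch of the kind of single-image OCR query `main.py` runs, following the upstream MiniCPM-V example usage rather than the file itself; the checkpoint id and the asset filename are assumptions, so adjust them to what this repo actually ships:

```python
import torch
from PIL import Image
from transformers import AutoModel, AutoTokenizer

MODEL_ID = "openbmb/MiniCPM-V-2_6"  # assumed checkpoint name

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
model = AutoModel.from_pretrained(
    MODEL_ID,
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,   # use float16 on GPUs without bf16 support
    attn_implementation="sdpa",   # avoids a hard flash_attn dependency
).eval().cuda()

# Hypothetical file name; point this at the OCR example image in assets/.
image = Image.open("assets/ocr_example.png").convert("RGB")
msgs = [{"role": "user", "content": [image, "Transcribe the text in this image."]}]

# model.chat() is provided by the checkpoint's remote code (trust_remote_code=True).
answer = model.chat(image=None, msgs=msgs, tokenizer=tokenizer)
print(answer)
```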
Solution: Install flash_attn, but use the newest version.
Solution: If you also don't like conda, get rid of it and use a plain venv instead (create and activate the venv, then reinstall PyTorch there):
python -m venv .venv
.venv\Scripts\activate
python -m pip uninstall torch
python -m pip cache purge
python -m pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124
Note: This is for CUDA 12.4. Adjust the index URL to match your system-wide CUDA version; check it with `nvcc --version`.
Solution: Pass your Hugging Face token when loading the tokenizer, as sketched below.
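A minimal sketch of how that can look, assuming the tokenizer is loaded through transformers; the checkpoint id is an assumption and the token string is a placeholder (alternatively, log in once with `huggingface-cli login` or set the `HF_TOKEN` environment variable):

```python
from transformers import AutoTokenizer

MODEL_ID = "openbmb/MiniCPM-V-2_6"  # assumed checkpoint; use the one from main.py

tokenizer = AutoTokenizer.from_pretrained(
    MODEL_ID,
    trust_remote_code=True,  # required for MiniCPM-V's custom code
    token="hf_xxxxxxxx",     # placeholder: your Hugging Face access token
)
```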
Solution: Load the processor with AutoProcessor.from_pretrained() and use the model itself (see the sketch below). This may not be the optimal solution performance-wise.
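A minimal sketch of that workaround, again assuming the checkpoint id used above; how inputs are then fed to the model depends on the checkpoint's remote code:

```python
import torch
from transformers import AutoModel, AutoProcessor

MODEL_ID = "openbmb/MiniCPM-V-2_6"  # assumed checkpoint; use the one from main.py

# The processor bundles the tokenizer and the image preprocessing in one object.
processor = AutoProcessor.from_pretrained(MODEL_ID, trust_remote_code=True)

# The model itself is loaded the same way as in the earlier sketch.
model = AutoModel.from_pretrained(
    MODEL_ID, trust_remote_code=True, torch_dtype=torch.bfloat16
).eval().cuda()
```

Inputs prepared by the processor (or by the chat helper shipped with the remote code) can then be passed to the model; as noted above, this route may trade some speed for simplicity.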