Skip to content

Simple-MLLM is a lightweight locally deployed Multimodal Large Model (MLLM) practice. Let's get your hands on the magic of MLLM, just on your local machine!Simple-MLLM是一个简易的在本地部署的多模态大模型(MLLM)实践,一起来上手感受MLLM的魔力,仅仅在你的本机上!

License

Notifications You must be signed in to change notification settings

Gnonymous/Simple-MLLM

Repository files navigation

Simple-MLLM

Simple-MLLM is a simple locally deployed Multimodal Large Model (MLLM) practice. Let's get your hands on the magic of MLLM, just on your local machine!

😎这是一个在本地部署的简易多模态大模型(MLLM)的实例,支持包括图片、文字以及语音(正在更新)多种模态的输入。

🤩如果你对多模态大模型(MLLM)感兴趣但是不知如何上手,来看看这个项目:一个简易的项目让你感受到多模态大模型的魔力👍


😎This is a simple example of deploying Multimodal Large Model (MLLM) locally with support for multiple modal inputs including image, text and voice (being updated).

🤩If you are interested in Multimodal Large Models (MLLM) but don't know how to get started. Take a look at this project: a simple project that will give you a taste of the magic of MLLM 👍

Overview

The input is img as below:

alt img

And then I ask the model:"描述”, after that the checkpoints are loaded. Finally:

The output is description as below:

alt img

Based Structure:

  • Socket
  • Qwen2-VL-3B
  • Whisper(being updated

About

Simple-MLLM is a lightweight locally deployed Multimodal Large Model (MLLM) practice. Let's get your hands on the magic of MLLM, just on your local machine!Simple-MLLM是一个简易的在本地部署的多模态大模型(MLLM)实践,一起来上手感受MLLM的魔力,仅仅在你的本机上!

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published