update qwen2vl to use optimum intel

openvinotoolkit · Dec 24, 2024 · f4914a9 · f4914a9
1 parent f5d2fca
commit f4914a9
Show file tree

Hide file tree

Showing 4 changed files with 139 additions and 360 deletions.
diff --git a/.ci/spellcheck/.pyspelling.wordlist.txt b/.ci/spellcheck/.pyspelling.wordlist.txt
@@ -629,6 +629,7 @@ OV
 OVC
 OVModel
 OVModelForCausalLM
+OVModelForVisualCausalLM
 OVModelForXXX
 OVModelForXxx
 OVMS

diff --git a/notebooks/phi-3-vision/phi-3-vision.ipynb b/notebooks/phi-3-vision/phi-3-vision.ipynb
@@ -166,9 +166,9 @@
     "## Convert and Optimize model\n",
     "[back to top ⬆️](#Table-of-contents:)\n",
     "\n",
-    "Phi-3-vision is PyTorch model. OpenVINO supports PyTorch models via conversion to OpenVINO Intermediate Representation (IR). [OpenVINO model conversion API](https://docs.openvino.ai/2024/openvino-workflow/model-preparation.html#convert-a-model-with-python-convert-model) should be used for these purposes. `ov.convert_model` function accepts original PyTorch model instance and example input for tracing and returns `ov.Model` representing this model in OpenVINO framework. Converted model can be used for saving on disk using `ov.save_model` function or directly loading on device using `core.complie_model`. \n",
+    "Phi-3-vision is PyTorch model. OpenVINO supports PyTorch models via conversion to OpenVINO Intermediate Representation (IR). [OpenVINO model conversion API](https://docs.openvino.ai/2024/openvino-workflow/model-preparation.html#convert-a-model-with-python-convert-model) should be used for these purposes. `ov.convert_model` function accepts original PyTorch model instance and example input for tracing and returns `ov.Model` representing this model in OpenVINO framework. Converted model can be used for saving on disk using `ov.save_model` function or directly loading on device using `core.compile_model`. \n",
     "\n",
-    "OpenVINO supports PyTorch models via conversion to OpenVINO Intermediate Representation format.  For convenience, we will use OpenVINO integration with HuggingFace Optimum. 🤗 [Optimum Intel](https://huggingface.co/docs/optimum/intel/index) is the interface between the 🤗 Transformers and Diffusers libraries and the different tools and libraries provided by Intel to accelerate end-to-end pipelines on Intel architectures.\n",
+    "For convenience, we will use OpenVINO integration with HuggingFace Optimum. 🤗 [Optimum Intel](https://huggingface.co/docs/optimum/intel/index) is the interface between the 🤗 Transformers and Diffusers libraries and the different tools and libraries provided by Intel to accelerate end-to-end pipelines on Intel architectures.\n",
     "\n",
     "Among other use cases, Optimum Intel provides a simple interface to optimize your Transformers and Diffusers models, convert them to the OpenVINO Intermediate Representation (IR) format and run inference using OpenVINO Runtime. `optimum-cli` provides command line interface for model conversion and optimization. \n",
     "\n",

diff --git a/notebooks/qwen2-vl/README.md b/notebooks/qwen2-vl/README.md
@@ -27,7 +27,7 @@ Qwen2VL is the latest addition to the QwenVL series of multimodal large language
 
 More details about model can be found in [model card](https://huggingface.co/Qwen/Qwen2-VL-7B-Instruct), [blog](https://qwenlm.github.io/blog/qwen2-vl/) and original [repo](https://github.com/QwenLM/Qwen2-VL).
 
-In this tutorial we consider how to convert and optimize Qwen2VL model for creating multimodal chatbot. Additionally, we demonstrate how to apply stateful transformation on LLM part and model optimization techniques like weights compression using [NNCF](https://github.com/openvinotoolkit/nncf)
+In this tutorial we consider how to convert and optimize Qwen2VL model for creating multimodal chatbot using [Optimum Intel](https://github.com/huggingface/optimum-intel). Additionally, we demonstrate how to apply model optimization techniques like weights compression using [NNCF](https://github.com/openvinotoolkit/nncf)
 
 ## Notebook contents
 The tutorial consists from following steps: