Commit

OpenVINO integration for CausalLM models
Signed-off-by: Helena <[email protected]>
helena-intel committed Feb 2, 2024
1 parent b5f534a commit 16fc318
Showing 3 changed files with 1,218 additions and 1,174 deletions.
8 changes: 4 additions & 4 deletions Dockerfile
@@ -160,9 +160,8 @@ COPY server/Makefile server/Makefile
 # Install server
 COPY proto proto
 COPY server server
-RUN cd server && \
-    make gen-server && \
-    pip install ".[accelerate]" --no-cache-dir
+# RUN --mount=type=cache,target=/root/.cache/pip cd server && make gen-server && pip install ".[accelerate, openvino]"
+RUN cd server && make gen-server && pip install ".[accelerate, openvino]" --no-cache-dir
 
 # Patch codegen model changes into transformers 4.35
 RUN cp server/transformers_patch/modeling_codegen.py ${SITE_PACKAGES}/transformers/models/codegen/modeling_codegen.py
@@ -311,7 +310,8 @@ RUN --mount=type=bind,from=auto-gptq-cache,src=/usr/src/auto-gptq-wheel,target=/
 # Install server
 COPY proto proto
 COPY server server
-RUN cd server && make gen-server && pip install ".[accelerate, onnx-gpu, quantize]" --no-cache-dir
+# RUN --mount=type=cache,target=/root/.cache/pip cd server && make gen-server && pip install ".[accelerate, openvino]"
+RUN cd server && make gen-server && pip install ".[accelerate, onnx-gpu, openvino, quantize]" --no-cache-dir
 
 # Patch codegen model changes into transformers 4.35
 RUN cp server/transformers_patch/modeling_codegen.py ${SITE_PACKAGES}/transformers/models/codegen/modeling_codegen.py
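
Both Dockerfile hunks add the `openvino` extra to the server's `pip install`. The following is a minimal sketch, not part of this commit, of how one might check inside a built image that an optional dependency is importable. The helper name `missing_modules` is hypothetical, and it is an assumption (not stated in the diff) that the extra provides a top-level module named `openvino`.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be located for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # "openvino" is an assumed module name for the extra added in this commit.
    missing = missing_modules(["openvino"])
    print("missing:", ", ".join(missing) if missing else "none")
```

Running the script in each image stage would show whether the extra resolved to an importable package, without loading a model.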