Releases: xorbitsai/inference
v0.15.2
What's new in 0.15.2 (2024-09-20)
These are the changes in inference v0.15.2.
New features
- FEAT: Support Qwen 2.5 by @Jun-Howie in #2325
- FEAT: support qwen2.5-coder-instruct and qwen2.5 sglang by @amumu96 in #2332
Bug fixes
- BUG: [UI] Fix registration page bug. by @yiboyasss in #2315
- BUG: Fix CosyVoice missing output by @codingl2k1 in #2320
- BUG: support old register llm format by @amumu96 in #2335
- BUG: fix stable diffusion from dify tool by @qinxuye in #2336
Documentation
Full Changelog: v0.15.1...v0.15.2
v0.15.1
What's new in 0.15.1 (2024-09-14)
These are the changes in inference v0.15.1.
New features
- FEAT: Support qwen2-vl-instruct GPTQ format and AWQ format by @Jun-Howie in #2251
- FEAT: Support minicpm-4B by @Jun-Howie in #2263
- FEAT: support sdapi/txt2img by @qinxuye in #2248
- FEAT: [UI] Auto-fill chat_template parameter on registration page. by @yiboyasss in #2268
- FEAT: support sdapi/sd-models and sdapi/samplers by @qinxuye in #2288
- FEAT: support deepseek-v2 and 2.5 by @amumu96 in #2292
- FEAT: Update Qwen2-VL-Model to support flash_attention_2 implementation by @LaureatePoet in #2289
- FEAT: support sdapi/img2img by @qinxuye in #2293
- FEAT: support flux.1 image2image and inpainting by @qinxuye in #2296
- FEAT: Support yi-coder-chat by @Jun-Howie in #2302
- FEAT: qwen2 audio by @codingl2k1 in #2271
Enhancements
- ENH: Update CosyVoice Huggingface by @codingl2k1 in #2249
- ENH: Supports multi functions in tool call for qwen2 by @ChengjieLi28 in #2265
- ENH: add
print-error
option in benchmark by @Dawnfz-Lenfeng in #2283 - ENH: Support fish speech 1.4 by @codingl2k1 in #2295
Bug fixes
- BUG: tts stream mode not working by @leslie2046 in #2279
- BUG: fix issue with model launch failing when .safetensors file is missing (#2094) by @Charmnut in #2290
- BUG: fix sampler_name for img2img by @qinxuye in #2301
- BUG: modify vllm image version by @amumu96 in #2311
- Bug: modify vllm image version by @amumu96 in #2312
Documentation
New Contributors
- @Jun-Howie made their first contribution in #2251
- @leslie2046 made their first contribution in #2279
- @Charmnut made their first contribution in #2290
- @LaureatePoet made their first contribution in #2289
Full Changelog: v0.15.0...v0.15.1
v0.15.0
What's new in 0.15.0 (2024-09-06)
These are the changes in inference v0.15.0.
New features
- FEAT: cosyvoice model support streaming reply by @wuminghui-coder in #2192
- FEAT: support qwen2-vl-instruct by @Minamiyama in #2205
Enhancements
- ENH: include openai-whisper into thirdparty by @qinxuye in #2232
- ENH:
MiniCPM-V-2.6
Supports continuous batching with transformers engine by @ChengjieLi28 in #2238 - ENH: unpad for image2image/inpainting model by @wxiwnd in #2229
- ENH: Refine request log and add optional request_id by @frostyplanet in #2173
- REF: Use
chat_template
for LLM instead ofprompt_style
by @ChengjieLi28 in #2193
Bug fixes
- BUG: Fix docker image startup issue due to entrypoint by @ChengjieLi28 in #2207
- BUG: fix init xinference fail when custom path is fault by @amumu96 in #2208
- BUG: use
default_uid
to replaceuid
of actors which may override the xoscar actor's uid property by @qinxuye in #2214 - BUG: fix rerank max length by @qinxuye in #2219
- BUG: logger bug of function using generator decoration by @wxiwnd in #2215
- BUG: fix rerank calculation of tokens number by @qinxuye in #2228
- BUG: fix embedding token calculation & optimize memory by @qinxuye in #2221
Documentation
- DOC: Modify the installation documentation to change single quotes to double quotes for Windows compatibility. by @nikelius in #2211
Others
- Revert "EHN: clean cache for VL models (#2163)" by @qinxuye in #2230
- CHORE: Docker image is only pushed to aliyun when releasing version by @ChengjieLi28 in #2216
- CHORE: Compatible with
openai >= 1.40
by @ChengjieLi28 in #2231
New Contributors
- @nikelius made their first contribution in #2211
- @wuminghui-coder made their first contribution in #2192
Full Changelog: v0.14.4...v0.15.0
v0.14.4.post1
What's new in 0.14.4.post1 (2024-09-03)
These are the changes in inference v0.14.4.post1.
Bug fixes
- BUG: Fix docker image startup issue due to entrypoint by @ChengjieLi28 in #2207
- BUG: fix init xinference fail when custom path is fault by @amumu96 in #2208
Documentation
- DOC: Modify the installation documentation to change single quotes to double quotes for Windows compatibility. by @nikelius in #2211
Others
- CHORE: Docker image is only pushed to aliyun when releasing version by @ChengjieLi28 in #2216
New Contributors
Full Changelog: v0.14.4...v0.14.4.post1
v0.14.4
What's new in 0.14.4 (2024-08-30)
These are the changes in inference v0.14.4.
New features
Enhancements
- ENH: support padding for sd inpainting model by @wxiwnd in #2165
- ENH: Move matcha to third party by @codingl2k1 in #2166
- ENH: Fix callback status when model die by @frostyplanet in #2172
- ENH: support cosyvoice-300m-instruct without instruction by @qinxuye in #2175
- ENH: Remove opencc and fast_whisper by @codingl2k1 in #2179
- ENH: solve the problem of health check for image-to-text model by @luhairong11 in #2182
Bug fixes
- BUG: docker compose failed due to empty entrypoint by @ChengjieLi28 in #2180
- BUG: 🐛 fix unable launch qwen2-embedding by @Zzzz1111 in #2185
- BUG: configuration key different by sglang's version by @lordk911 in #2188
- BUG: fix lora not load in transformers engine by @amumu96 in #2194
- BUG: fix register model list error by @amumu96 in #2189
- BUG: Fix list video model by @codingl2k1 in #2190
- BUG: fix custom path test error by @amumu96 in #2200
Others
- EHN: clean cache for VL models by @qinxuye in #2163
- CHORE: Clean test env by @codingl2k1 in #2183
New Contributors
- @luhairong11 made their first contribution in #2182
- @lordk911 made their first contribution in #2188
Full Changelog: v0.14.3...v0.14.4
v0.14.3
What's new in 0.14.3 (2024-08-25)
These are the changes in inference v0.14.3.
New features
- FEAT: ChatTTS speech voice support encoded speaker str by @codingl2k1 in #2096
- FEAT: [UI] Add other parameters to other models besides the LLM model. by @yiboyasss in #2129
- FEAT: support SD3-medium inpainting by @qinxuye in #2137
- feat: 🎸 Added the model dtype parameter for embedding (currently only supported for models gte-Qwen2). by @Zzzz1111 in #2120
- FEAT: Support fish speech model by @codingl2k1 in #2119
- FEAT: support CogVLM2-video by @Minamiyama in #2110
- FEAT: Support LMDeploy for internvl2 and fix finish reasion miss at internvl stream by @amumu96 in #2145
Enhancements
- ENH: make internvl2 support video by @Minamiyama in #2104
- ENH: support process_image with padding for image_to_image by @qinxuye in #2109
- REF: use utils._decode_image replacing same codes in individual vl files by @Minamiyama in #2105
Bug fixes
- BUG: fix asyncio.Queue error in benchmark by @Dawnfz-Lenfeng in #2113
- BUG: fix deleting cache by @qinxuye in #2114
- Bug: fix audio ability errorwhen get instance info by @amumu96 in #2147
- BUG: fix docker conflict by @amumu96 in #2156
Documentation
- DOC: add more doc about ChatTTS by @qinxuye in #2108
- DOC: remove models deleted by @qinxuye in #2122
- DOC: Add doc for fish speech and cogvlm2 video by @codingl2k1 in #2149
Others
- [Bug]: Fix concurrent ops in worker initialization by @frostyplanet in #2125
New Contributors
Full Changelog: v0.14.2...v0.14.3
v0.14.2
What's new in 0.14.2 (2024-08-16)
These are the changes in inference v0.14.2.
New features
- FEAT: add gemma-2-it 2b & internlm2.5-chat 1.8b and 20b & update video and sglang docs by @qinxuye in #2080
- FEAT: support FP8 for vllm & sglang engine by @qinxuye in #2069
- Feat: Support internvl2 and internvl stream by @amumu96 in #2079
Enhancements
- ENH: make MiniCPM v2.6 support video by @Minamiyama in #2068
- REF: Remove some builtin old models and
ggmlv3
model format by @ChengjieLi28 in #2086
Bug fixes
- BUG: limit AutoAWQ version to fix docker issue by @qinxuye in #2067
- BUG: Fix custom glm4 & remove tool calls of ChatGLM3 by @codingl2k1 in #2081
- BUG: Infinited loop with login by @WalkerWang731 in #2039
Documentation
New Contributors
- @WalkerWang731 made their first contribution in #2039
Full Changelog: v0.14.1...v0.14.2
v0.14.1.post1
What's new in 0.14.1.post1 (2024-08-13)
These are the changes in inference v0.14.1.post1.
Bug fixes
Documentation
Full Changelog: v0.14.1...v0.14.1.post1
v0.14.1
What's new in 0.14.1 (2024-08-09)
These are the changes in inference v0.14.1.
New features
- FEAT: support SenseVoice audio-to-text model by @qinxuye in #2008
- FEAT: support flux.1-schnell & flux.1-dev by @qinxuye in #2007
- FEAT: support kolors image model by @qinxuye in #2028
- FEAT: Add support for llama-3.1-instruct 405B model by @frostyplanet in #2025
- FEAT: Support CogVideoX video model by @codingl2k1 in #2049
- FEAT: Support MiniCPM-v-2_6 by @Minamiyama in #2031
Enhancements
- ENH: Improve internal server error by @codingl2k1 in #2009
- ENH: Add
stream
option in Benchmark by @Dawnfz-Lenfeng in #2038 - ENH: optimize availability of vLLM by @qinxuye in #2046
- ENH: [worker] Allow init supervisor_ref lazy by @frostyplanet in #1958
- ENH: optimize performance of sglang by @qinxuye in #2050
- REF: Mark
Deprecate
forprompt
,system_prompt
andchat_history
parameters inchat
client interface by @ChengjieLi28 in #2043
Bug fixes
- BUG: fix flexible model register in worker by @frostyplanet in #2011
- BUG: [UI] Fix the 'model_path' bug. by @yiboyasss in #2015
- BUG: fix custom embedding launch error by @amumu96 in #2016
Tests
- TST: Fix some dependency version issues by @ChengjieLi28 in #2042
Documentation
- DOC: Directly launch custom model by
model_path
by @ChengjieLi28 in #2047 - DOC: fix typo in README by @ArtificialZeng in #2048
Others
- CHORE: Increased frequency of issue processing by @ChengjieLi28 in #2024
New Contributors
- @ArtificialZeng made their first contribution in #2048
- @Dawnfz-Lenfeng made their first contribution in #2038
Full Changelog: v0.14.0...v0.14.1
v0.14.0.post1
What's new in 0.14.0.post1 (2024-08-05)
These are the changes in inference v0.14.0.post1.
Enhancements
- ENH: Improve internal server error by @codingl2k1 in #2009
Bug fixes
- BUG: fix flexible model register in worker by @frostyplanet in #2011
- BUG: [UI] Fix the 'model_path' bug. by @yiboyasss in #2015
- BUG: fix custom embedding launch error by @amumu96 in #2016
Full Changelog: v0.14.0...v0.14.0.post1