
OpenVINO Model Server 2025.2.1

Released by @atobiszei on 16 Jul 13:49 · 4 commits to releases/2025/2 since this release · commit b7e0a09

OpenVINO Model Server 2025.2.1 is a minor release with bug fixes and improvements, mainly in automatic model pulling and image generation.

Other changes:

  • Updated the NPU driver from version 1.17 to 1.19 in Docker images
  • Security-related updates in dependencies

Bug fixes:

  • Removed a limitation in image generation: requesting several output images with the n parameter is now supported (see the request example after this list)
  • The add_to_config and remove_from_config parameters now accept a path to the configuration file in addition to a directory containing config.json (see the sketch after this list)
  • Resolved connectivity issues when pulling models from HuggingFace Hub without a proxy configuration
  • Fixed handling of the HF_ENDPOINT environment variable with HTTP addresses; previously an https:// prefix was incorrectly added
  • Renamed the pull feature environment variables GIT_SERVER_CONNECT_TIMEOUT_MS to GIT_OPT_SET_SERVER_CONNECT_TIMEOUT and GIT_SERVER_TIMEOUT_MS to GIT_OPT_SET_SERVER_TIMEOUT to unify with the underlying libgit2 implementation (see the variables sketch after this list)
  • Fixed handling of relative paths on Windows with MediaPipe graphs/LLMs for the config_path parameter
  • Fixed the agentic demo not working without a proxy
  • Stopped rejecting the response_format field in image generation. The parameter currently accepts only the base64_json value, which enables integration with Open WebUI (as in the example below)
  • Added the missing --response_parser parameter when using OVMS to pull an LLM model and prepare its configuration
  • Blocked simultaneous use of the --list_models and --pull parameters, as they are mutually exclusive
  • Fixed accuracy of the Phi4-mini response parser when calling functions with lists as arguments
  • Fixed the export_model.py script's handling of target_device for embeddings and reranking models
  • The stateful text generation pipeline no longer includes usage content, which is not supported for this pipeline type; previously it returned an incorrect response
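
As an illustration of the image generation fixes above, here is a minimal request sketch. It assumes an OpenAI-compatible images endpoint at /v3/images/generations, a server listening on port 8000, and a model served under the hypothetical name stable-diffusion:

# Request two output images in one call; response_format is no longer rejected
curl -s http://localhost:8000/v3/images/generations \
  -H "Content-Type: application/json" \
  -d '{"model": "stable-diffusion", "prompt": "a red apple on a table", "n": 2, "response_format": "base64_json"}'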
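
Similarly, a sketch of pointing add_to_config and remove_from_config at a configuration file directly; the paths and model name are placeholders, and the pairing with --model_name and --model_path is assumed to match the directory-based usage:

# Register a model in an explicit config file rather than a directory
ovms --add_to_config /opt/models/config.json --model_name my_model --model_path /opt/models/my_model
# Remove it again, referencing the same file
ovms --remove_from_config /opt/models/config.json --model_name my_model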
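
Finally, a sketch of the renamed pull-feature timeout variables together with an HTTP HF_ENDPOINT. The mirror address, timeout values, and model name are illustrative, the pull invocation assumes the --pull, --source_model, and --model_repository_path flags of the model pulling feature, and timeouts are in milliseconds:

export HF_ENDPOINT=http://hf-mirror.internal:8080   # http:// is kept as-is; no https:// prefix is forced
export GIT_OPT_SET_SERVER_CONNECT_TIMEOUT=5000      # formerly GIT_SERVER_CONNECT_TIMEOUT_MS
export GIT_OPT_SET_SERVER_TIMEOUT=30000             # formerly GIT_SERVER_TIMEOUT_MS
ovms --pull --source_model OpenVINO/Phi-3.5-mini-instruct-int4-ov --model_repository_path /opt/models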

Known issues and limitations

  • VLM models QwenVL2, QwenVL2.5, and Phi3_VL have lower accuracy when deployed on CPU in a text generation pipeline with continuous batching. It is recommended to deploy these models in a stateful pipeline, which processes requests sequentially, as in the demo.
  • Using NPU for image generation endpoints is unsupported in this release.

You can use the OpenVINO Model Server public Docker images based on Ubuntu via the following commands:

docker pull openvino/model_server:2025.2.1 - CPU device support with image based on Ubuntu 24.04
docker pull openvino/model_server:2025.2.1-gpu - GPU, NPU and CPU device support with image based on Ubuntu 24.04
or use the provided binary packages. Only packages with the _python_on suffix include Python support.
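
For example, a minimal sketch of starting the CPU image and serving a single model; the model name and paths are placeholders:

# Serve a model from a mounted directory over gRPC on port 9000
docker run -d --rm -p 9000:9000 -v /opt/models:/models openvino/model_server:2025.2.1 \
  --model_name my_model --model_path /models/my_model --port 9000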

Check the instructions on how to install the binary package.
The prebuilt image is also available in the Red Hat Ecosystem Catalog.