
Conversation

@BrewTestBot
Member

Created by brew bump


Created with brew bump-formula-pr.

release notes




Welcome to LocalAI 3.8.0!

LocalAI 3.8.0 focuses on smoothing out the user experience and exposing more power to the user without requiring restarts or complex configuration files. This release introduces a new onboarding flow and a universal model loader that handles everything from HF URLs to local files.

We’ve also improved the chat interface, addressed long-standing requests around OpenAI API compatibility (specifically SSE streaming standards), and exposed more granular controls for individual backends (llama.cpp) and for backend management.

📌 TL;DR

  • Universal Model Import: import directly from Hugging Face, Ollama, OCI, or local paths. Auto-detects backends and handles chat templates.
  • UI & Index Overhaul: new onboarding wizard, auto-model selection on boot, and a cleaner tabular view for model management.
  • MCP Live Streaming: agent actions and tool calls are now streamed live via the Model Context Protocol, so you can watch reasoning in real time.
  • Hot-Reloadable Settings: modify watchdogs, API keys, P2P settings, and defaults without restarting the container.
  • Chat Enhancements: chat history and parallel conversations are now persisted in local storage.
  • Strict SSE Compliance: the streaming format now matches the OpenAI spec exactly (resolves issues with LangChain/JS clients).
  • Advanced Config: fine-tune context_shift, cache_ram, and parallel workers via YAML options.
  • Logprobs & Logitbias: token-level probability support for improved agent/eval workflows.

Feature Breakdown

🚀 Universal Model Import (URL-based)

We have refactored how models are imported. You no longer need to manually write configuration files for common use cases. The new importer accepts URLs from Hugging Face, Ollama, and OCI registries, as well as local file paths, directly from the web interface.

import.mp4
  • Auto-Detection: The system attempts to identify the correct backend (e.g., llama.cpp vs diffusers) and applies native chat templates (e.g., llama-3, mistral) automatically by reading the model metadata.
  • Customization during Import: You can override defaults immediately, for example, forcing a specific quantization on a GGUF file or selecting vLLM over transformers.
  • Multimodal Support: Vision components (mmproj) are detected and configured automatically.
  • File Safety: We added a safeguard to prevent the deletion of model files (blobs) if they are shared by multiple model configurations.
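
For scripted setups, an imported model shows up through the standard OpenAI-compatible model listing, so you can verify an import without opening the UI. A minimal sketch using the openai Python client; the base URL assumes LocalAI's default local port, so adjust it to your deployment:

```python
from openai import OpenAI

# Point the standard OpenAI client at a local LocalAI instance
# (http://localhost:8080 is assumed here; adjust if your port differs).
client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")

# Successfully imported models are listed by the OpenAI-compatible /v1/models endpoint.
for model in client.models.list():
    print(model.id)
```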

🎨 Complete UI Overhaul

The web interface has been redesigned for better usability and clearer management.

index.mp4
  • Onboarding Wizard: A guided flow helps first-time users import or install a model in under 30 seconds.
  • Auto-Focus & Selection: The input field captures focus automatically, and a default model is loaded on startup so you don't start in a "no model selected" state.
  • Tabular Management: Models and backends are now organized in a cleaner list view, making it easier to see what is installed.
manage.mp4

🤖 Agentic Ecosystem & MCP Live Streaming

LocalAI 3.8.0 significantly upgrades support for agentic workflows using the Model Context Protocol (MCP).

  • Live Action Streaming: We have added a new endpoint to stream agent results as they happen. Instead of waiting for the final output, you can now watch the agent "think": tool calls, reasoning steps, and intermediate actions are streamed live in the UI.
mcp.mp4

Configuring MCP via the interface is now simplified:

mcp_configuration.mp4

🔁 Runtime System Settings

A new Settings > System panel exposes configuration options that previously required environment variables or a restart.

settings.mp4
  • Immediate Effect: Toggling Watchdogs, P2P, and Gallery availability applies instantly.
  • API Key Management: You can now generate, rotate, and expire API keys via the UI.
  • Network: CORS and CSRF settings are now accessible here (note: these specific network settings still require a restart to take effect).

Note: to benefit from persisted runtime settings, older LocalAI deployments need to mount the container's /configuration directory as a volume.


⚙️ Advanced llama.cpp Configuration

For power users running large context windows or high-throughput setups, we've exposed additional underlying llama.cpp options in the YAML config. You can now tune context shifting, RAM limits for the KV cache, and parallel worker slots.

options:
- context_shift:false
- cache_ram:-1
- use_jinja:true
- parallel:2
- grpc_servers:localhost:50051,localhost:50052

📊 Logprobs & Logitbias Support

This release adds full support for logit_bias and logprobs, following the OpenAI specification. This is critical for advanced agentic logic, Self-RAG, and for evaluating model confidence and hallucination rates.
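
Because the parameters follow the OpenAI specification, existing clients can use them unchanged. A minimal sketch with the openai Python client, assuming a local instance on the default port; the model name and the token id passed to logit_bias are placeholders for illustration, not part of this release:

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")

resp = client.chat.completions.create(
    model="my-local-model",        # placeholder model name
    messages=[{"role": "user", "content": "Answer with a single word: yes or no?"}],
    logprobs=True,                 # return token-level log probabilities
    top_logprobs=5,                # include the 5 most likely alternatives per token
    logit_bias={"1939": -100},     # bias against an arbitrary token id (illustrative only)
    max_tokens=4,
)

# Assuming the response mirrors the OpenAI shape, as the release notes state:
# per-token logprobs plus the ranked alternatives for each position.
for token in resp.choices[0].logprobs.content:
    alternatives = {alt.token: round(alt.logprob, 3) for alt in token.top_logprobs}
    print(token.token, alternatives)
```

Comparing the top candidates at each position is a simple confidence signal for eval pipelines or Self-RAG style routing.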


🛠️ Fixes & Improvements

OpenAI Compatibility:

  • SSE Streaming: Fixed a critical issue where streaming responses were slightly non-compliant (e.g., sending empty content chunks or missing finish_reason). This resolves integration issues with openai-node, LangChain, and LlamaIndex; see the sketch after this list.
  • Top_N Behavior: In the reranker, top_n can now be omitted or set to 0 to return all results, rather than defaulting to an arbitrary limit.
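
As a reference point, the kind of client loop that previously tripped over non-compliant chunks now behaves as expected: content deltas are not sent empty and the closing chunk carries finish_reason. A minimal sketch with the openai Python client (port and model name are placeholders):

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")

stream = client.chat.completions.create(
    model="my-local-model",   # placeholder model name
    messages=[{"role": "user", "content": "Write a haiku about local inference."}],
    stream=True,
)

for chunk in stream:
    choice = chunk.choices[0]
    if choice.delta.content:                 # content deltas are no longer empty
        print(choice.delta.content, end="", flush=True)
    if choice.finish_reason is not None:     # the final chunk now includes finish_reason
        print(f"\n[finish_reason: {choice.finish_reason}]")
```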

General Fixes:

  • Model Preview: You can now see the actual filename and size of a model before committing to the download.
  • Tool Handling: Fixed crashes when tool content is missing or malformed.
  • TTS: Fixed dropdown selection states for TTS models.
  • Browser Storage: Chat history is now persisted in your browser's local storage. You can switch between parallel chats, rename them, and export them to JSON.
  • True Cancellation: Clicking "Stop" during a stream now correctly propagates a cancellation context to the backend (works for llama.cpp, vLLM, transformers, and diffusers). This immediately stops generation and frees up resources.
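
No special client API is needed to benefit from this: closing the stream (or simply dropping the connection) is what now gets propagated to the backend. A minimal sketch, again with the openai Python client and a placeholder model name and port, using an arbitrary early-stop condition:

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")

stream = client.chat.completions.create(
    model="my-local-model",   # placeholder model name
    messages=[{"role": "user", "content": "Count to one million, one number per line."}],
    stream=True,
)

received = ""
for chunk in stream:
    received += chunk.choices[0].delta.content or ""
    if len(received) > 200:   # arbitrary early-stop condition for illustration
        stream.close()        # dropping the connection cancels generation server-side
        break
```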

🚀 The Complete Local Stack for Privacy-First AI

LocalAI Logo

LocalAI

The free, Open Source OpenAI alternative. Drop-in replacement REST API compatible with OpenAI specifications for local AI inferencing. No GPU required.

Link: https://github.com/mudler/LocalAI

LocalAGI Logo

LocalAGI

Local AI agent management platform. Drop-in replacement for OpenAI's Responses API, supercharged with advanced agentic capabilities and a no-code UI.

Link: https://github.com/mudler/LocalAGI

LocalRecall Logo

LocalRecall

RESTful API and knowledge base management system providing persistent memory and storage capabilities for AI agents. Works alongside LocalAI and LocalAGI.

Link: https://github.com/mudler/LocalRecall


❤️ Thank You

Over 35,000 stars and growing. LocalAI is a true FOSS movement — built by contributors, powered by community.

If you believe in privacy-first AI:

  • Star the repo
  • 💬 Contribute code, docs, or feedback
  • 📣 Share with others

Your support keeps this stack alive.

✅ Full Changelog



Full Changelog: mudler/LocalAI@v3.7.0...v3.8.0

View the full release notes at https://github.com/mudler/LocalAI/releases/tag/v3.8.0.


@github-actions bot added the go (Go use is a significant feature of the PR or issue) and bump-formula-pr (PR was created using `brew bump-formula-pr`) labels Nov 26, 2025
@botantony added the CI-no-fail-fast (Continue CI tests despite failing GitHub Actions matrix builds) label Nov 27, 2025
@daeho-ro added the test failure (CI fails while running the test-do block) label Nov 27, 2025
@chenrui333
Member

  ==> Testing localai
  7:38PM INF Starting LocalAI using 4 threads, with models path: /private/tmp/localai-test-20251126-10467-8rhjzu/models
  7:38PM INF LocalAI version: 3.8.0 ()
  7:38PM INF Using metal capability (arm64 on mac), set LOCALAI_FORCE_META_BACKEND_CAPABILITY to override
  7:38PM INF Preloading models from /private/tmp/localai-test-20251126-10467-8rhjzu/models
  7:38PM INF core/startup process completed!
  ⇨ http server started on 127.0.0.1:49299
  ==> curl -s -i 127.0.0.1:49299
  7:38PM INF Using metal capability (arm64 on mac), set LOCALAI_FORCE_META_BACKEND_CAPABILITY to override
  7:38PM INF HTTP request method=GET path=/ status=200
  Killing child processes...
  Error: localai: failed
  An exception occurred within a child process:
    Minitest::Assertion: Expected /HTTP\/1\.1\ 200\ OK/ to match "HTTP/1.1 500 Internal Server Error\r\nContent-Type: application/json\r\nDate: Thu, 27 Nov 2025 00:38:14 GMT\r\nContent-Length: 84\r\n\r\n{\"error\":{\"code\":500,\"message\":\"json: unsupported type: func([]string)\",\"type\":\"\"}}\n".

localai: update test

Signed-off-by: Rui Chen <[email protected]>
@chenrui333 added the ready to merge (PR can be merged once CI is green) label and removed the test failure (CI fails while running the test-do block) and CI-no-fail-fast (Continue CI tests despite failing GitHub Actions matrix builds) labels Nov 27, 2025
@github-actions
Copy link
Contributor

🤖 An automated task has requested bottles to be published to this PR.

Caution

Please do not push to this PR branch before the bottle commits have been pushed, as this results in a state that is difficult to recover from. If you need to resolve a merge conflict, please use a merge commit. Do not force-push to this PR branch.

@github-actions bot added the CI-published-bottle-commits (The commits for the built bottles have been pushed to the PR branch) label Nov 27, 2025
@BrewTestBot added this pull request to the merge queue Nov 27, 2025
Merged via the queue into main with commit 0d81a49 Nov 27, 2025
22 checks passed
@BrewTestBot deleted the bump-localai-3.8.0 branch November 27, 2025 19:18
