Releases: BerriAI/litellm
v1.75.7-nightly
What's Changed
- [Proxy] LiteLLM mock test fix by @jugaldb in #13635
- [Proxy] Litellm add DB metrics to prometheus by @jugaldb in #13626
- [LLM Translation] Fix Realtime API endpoint for no intent by @jugaldb in #13476
- [MCP Gateway] LiteLLM Fix MCP gateway key auth by @jugaldb in #13630
- [Fix] Ensure /messages works when using `bedrock/converse/` with LiteLLM by @ishaan-jaff in #13627 (see the sketch after this list)
- UI - Fix image overflow in LiteLLM model by @ishaan-jaff in #13639
- [Bug Fix] /messages endpoint - ensure tool use arguments are returned for non-anthropic models by @ishaan-jaff in #13638
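Both /messages items above touch the proxy's Anthropic-compatible endpoint. A minimal sketch of calling it with the Anthropic Python SDK, assuming the proxy is running locally with a virtual key; `bedrock-claude` is a hypothetical model alias assumed to route through `bedrock/converse/` in your proxy config:

```python
import anthropic

# Point the Anthropic SDK at the LiteLLM proxy's /v1/messages endpoint.
# "sk-1234" is a placeholder virtual key; "bedrock-claude" is a hypothetical
# alias assumed to map to a bedrock/converse/... model in the proxy config.
client = anthropic.Anthropic(base_url="http://localhost:4000", api_key="sk-1234")

response = client.messages.create(
    model="bedrock-claude",
    max_tokens=256,
    messages=[{"role": "user", "content": "Say hello in one word."}],
)
print(response.content[0].text)
```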
Full Changelog: v1.75.6-nightly...v1.75.7-nightly
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.75.7-nightly
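Once the container is up, the proxy serves an OpenAI-compatible API on port 4000. A minimal sketch with the OpenAI Python SDK; the API key and model name are placeholders for whatever your proxy is configured with:

```python
from openai import OpenAI

# "sk-1234" is a placeholder virtual/master key; "gpt-4o-mini" stands in for
# any model name defined on your proxy.
client = OpenAI(base_url="http://localhost:4000", api_key="sk-1234")

resp = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello from the LiteLLM proxy!"}],
)
print(resp.choices[0].message.content)
```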
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 140.0 | 182.49004836898376 | 6.247441314306712 | 0.0 | 1870 | 0 | 114.26430999995318 | 2038.3160259999613 |
Aggregated | Passed ✅ | 140.0 | 182.49004836898376 | 6.247441314306712 | 0.0 | 1870 | 0 | 114.26430999995318 | 2038.3160259999613 |
v1.75.6-nightly
What's Changed
- [Bug Fix] - Allow using `reasoning_effort` for gpt-5 model family and `reasoning` for Responses API by @ishaan-jaff in #13475 (see the sketch after this list)
- [Bug Fix]: Azure OpenAI GPT-5 max_tokens + `reasoning` param support by @ishaan-jaff in #13510
- [Draft] [LLM Translation] Add model id check by @jugaldb in #13507
- [Docs] - Document how to send tags with LiteLLM Python SDK calls to the LiteLLM Proxy by @ishaan-jaff in #13517
- Fix OCI streaming by @breno-aumo in #13437
- feat: add CometAPI provider support with chat completions and streaming by @TensorNull in #13458
- Allow unsetting TPM and RPM - Teams Settings by @NANDINI-star in #13430
- [Feat] - Add key/team logging for Langfuse OTEL Logger by @ishaan-jaff in #13512
- [Feat] Add Streaming support + Docs for bedrock gpt-oss model family by @ishaan-jaff in #13346
- [Feat] GEMINI CLI Integration - Add /countTokens endpoint support by @ishaan-jaff in #13545
- Feat/sambanova embeddings by @jhpiedrahitao in #13308
- Display Error from Backend on the UI - Keys Page by @NANDINI-star in #13435
- Team Member Permissions Page - Access Column Changes by @NANDINI-star in #13145
- Fix internal users table overflow by @NANDINI-star in #12736
- Enhance chart readability with short-form notation for large numbers by @NANDINI-star in #12370
- [Bug fix] SCIM Team Memberships - handle metadata by @ishaan-jaff in #13553
- [Feat] GEMINI CLI - Add Token Counter for VertexAI Models by @ishaan-jaff in #13558
- [Feat] Add CredentialDeleteModal component and integrate with CredentialsPanel by @jugaldb in #13550
- Implement GitHub Action to auto-label issues with provider keywords by @kankute-sameer in #13537
- LiteLLM SDK <-> Proxy: support `user` param + Prisma - remove `use_prisma_migrate` flag - redundant as this is now default by @krrishdholakia in #13555
- [Fix] Streaming - consistent 'finish_reason' chunk index by @krrishdholakia in #13560
- [Fix] Hide sensitive data in /model/info - azure entra client_secret by @MajorD00m in #13577
- Fix Ollama GPT-OSS streaming with 'thinking' field by @colesmcintosh in #13375
- fix(azure): remove trailing semicolon in Content-Type header for image generation by @VerunicaM in #13584
- Remove ambiguous network response error by @NANDINI-star in #13582
- [fix] Enhance MCPServerManager with access groups and description support by @jugaldb in #13549
- [Feat] New model `vertex_ai/deepseek-ai/deepseek-r1-0528-maas` by @ishaan-jaff in #13594
- [Docs] Update build from pip docs - new prisma migrate by @ishaan-jaff in #13603
- [Feat] New provider - Azure AI Flux Image Generation by @ishaan-jaff in #13592
- [Feat] Team Member Rate Limits + Support for using with JWT Auth by @ishaan-jaff in #13601
- Fix e2e_ui_testing by @NANDINI-star in #13610
- fix(volcengine): handle thinking disabled parameter properly by @colesmcintosh in #13598
- [Feat] Add `reasoning_effort` param for hosted_vllm provider by @ishaan-jaff in #13620
- perf(main.py): new 'EXPERIMENTAL_OPENAI_BASE_LLM_HTTP_HANDLER' flag (+100 rps improvement on openai calls) by @krrishdholakia in #13625
- Add deepseek-chat-v3-0324 to OpenRouter cost map by @huangyafei in #13607
- [LLM Translation/Proxy] Fix - add safe divide by 0 for most places to prevent crash by @jugaldb in #13624
- [LLM translation] Refactor Anthropic Configurations and Add Support for `anthropic_beta` Headers by @jugaldb in #13590
- [Management/UI] Allow routes for admin viewer by @jugaldb in #13588
- [Proxy] Litellm fix mapped tests by @jugaldb in #13634
- Update mlflow logger usage span attributes by @TomeHirata in #13561
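A minimal sketch of the `reasoning_effort` change from #13475, using the LiteLLM Python SDK directly; the model name and effort value are illustrative, and an `OPENAI_API_KEY` is assumed to be set in the environment:

```python
import litellm

# Hedged sketch: gpt-5 family models can now take reasoning_effort on
# chat completions (per #13475). Assumes OPENAI_API_KEY is set.
response = litellm.completion(
    model="gpt-5-mini",
    messages=[{"role": "user", "content": "Explain the LiteLLM proxy in one sentence."}],
    reasoning_effort="minimal",
)
print(response.choices[0].message.content)
```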
New Contributors
- @TensorNull made their first contribution in #13458
- @MajorD00m made their first contribution in #13577
- @VerunicaM made their first contribution in #13584
- @huangyafei made their first contribution in #13607
- @TomeHirata made their first contribution in #13561
Full Changelog: v1.75.5.rc.1...v1.75.6-nightly
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.75.6-nightly
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 110.0 | 152.45377470174589 | 6.50695562099742 | 0.0 | 1948 | 0 | 86.13810599996441 | 2202.5806519999946 |
Aggregated | Passed ✅ | 110.0 | 152.45377470174589 | 6.50695562099742 | 0.0 | 1948 | 0 | 86.13810599996441 | 2202.5806519999946 |
litellm_v1.75.5-dev_memory_fix_2
Full Changelog: litellm_v1.75.5-dev_memory_fix...litellm_v1.75.5-dev_memory_fix_2
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-litellm_v1.75.5-dev_memory_fix_2
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 110.0 | 148.3727584527906 | 6.403502625001942 | 0.0 | 1917 | 0 | 82.2168479999732 | 1009.9184229999878 |
Aggregated | Passed ✅ | 110.0 | 148.3727584527906 | 6.403502625001942 | 0.0 | 1917 | 0 | 82.2168479999732 | 1009.9184229999878 |
litellm_v1.73.0-dev_memory_fix_2
Full Changelog: litellm_v1.73.0-dev_memory_fix_1...litellm_v1.73.0-dev_memory_fix_2
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-litellm_v1.73.0-dev_memory_fix_2
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 160.0 | 197.89495319271438 | 6.333208452490925 | 0.0 | 1894 | 0 | 120.76194100001203 | 1698.8860899999736 |
Aggregated | Passed ✅ | 160.0 | 197.89495319271438 | 6.333208452490925 | 0.0 | 1894 | 0 | 120.76194100001203 | 1698.8860899999736 |
litellm_v1.73.0-dev_memory_fix_1
Full Changelog: litellm_v1.73.0-dev_memory_fix...litellm_v1.73.0-dev_memory_fix_1
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-litellm_v1.73.0-dev_memory_fix_1
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 130.0 | 168.52018052761707 | 6.353625704455274 | 0.0 | 1901 | 0 | 102.81331199996657 | 955.3499159999888 |
Aggregated | Passed ✅ | 130.0 | 168.52018052761707 | 6.353625704455274 | 0.0 | 1901 | 0 | 102.81331199996657 | 955.3499159999888 |
litellm_v1.73.0-dev_memory_fix
Full Changelog: v1.73.0.rc.1...litellm_v1.73.0-dev_memory_fix
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-litellm_v1.73.0-dev_memory_fix
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 170.0 | 201.69670727064485 | 6.271320946816614 | 0.0 | 1877 | 0 | 124.88003300001083 | 1273.2718879999823 |
Aggregated | Passed ✅ | 170.0 | 201.69670727064485 | 6.271320946816614 | 0.0 | 1877 | 0 | 124.88003300001083 | 1273.2718879999823 |
v1.75.5.dev3
What's Changed
- [Bug Fix] - Allow using `reasoning_effort` for gpt-5 model family and `reasoning` for Responses API by @ishaan-jaff in #13475
- [Bug Fix]: Azure OpenAI GPT-5 max_tokens + `reasoning` param support by @ishaan-jaff in #13510
- [Draft] [LLM Translation] Add model id check by @jugaldb in #13507
- [Docs] - Document how to send tags with LiteLLM Python SDK calls to the LiteLLM Proxy by @ishaan-jaff in #13517
- Fix OCI streaming by @breno-aumo in #13437
- feat: add CometAPI provider support with chat completions and streaming by @TensorNull in #13458
- Allow unsetting TPM and RPM - Teams Settings by @NANDINI-star in #13430
- [Feat] - Add key/team logging for Langfuse OTEL Logger by @ishaan-jaff in #13512
- [Feat] Add Streaming support + Docs for bedrock gpt-oss model family by @ishaan-jaff in #13346
New Contributors
- @TensorNull made their first contribution in #13458
Full Changelog: v1.75.5.rc.1...v1.75.5.dev3
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.75.5.dev3
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 170.0 | 209.8803997534473 | 6.313294138003075 | 0.0 | 1886 | 0 | 126.72262299997783 | 1268.0979020000223 |
Aggregated | Passed ✅ | 170.0 | 209.8803997534473 | 6.313294138003075 | 0.0 | 1886 | 0 | 126.72262299997783 | 1268.0979020000223 |
litellm_v1.75.5-dev_memory_fix_1
What's Changed
- [Bug Fix] - Allow using `reasoning_effort` for gpt-5 model family and `reasoning` for Responses API by @ishaan-jaff in #13475
- [Bug Fix]: Azure OpenAI GPT-5 max_tokens + `reasoning` param support by @ishaan-jaff in #13510
- [Draft] [LLM Translation] Add model id check by @jugaldb in #13507
Full Changelog: v1.75.5.rc.1...litellm_v1.75.5-dev_memory_fix_1
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-litellm_v1.75.5-dev_memory_fix_1
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 180.0 | 211.1796445664915 | 6.382692480357281 | 0.0 | 1910 | 0 | 132.33785100001683 | 1892.3347159999935 |
Aggregated | Passed ✅ | 180.0 | 211.1796445664915 | 6.382692480357281 | 0.0 | 1910 | 0 | 132.33785100001683 | 1892.3347159999935 |
litellm_v1.75.5-dev_memory_fix
What's Changed
- [Bug Fix] - Allow using `reasoning_effort` for gpt-5 model family and `reasoning` for Responses API by @ishaan-jaff in #13475
- [Bug Fix]: Azure OpenAI GPT-5 max_tokens + `reasoning` param support by @ishaan-jaff in #13510
- [Draft] [LLM Translation] Add model id check by @jugaldb in #13507
Full Changelog: v1.75.5.rc.1...litellm_v1.75.5-dev_memory_fix
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-litellm_v1.75.5-dev_memory_fix
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 120.0 | 153.71541160805808 | 6.384305831136473 | 0.0 | 1911 | 0 | 80.34984599999007 | 1251.9617030000063 |
Aggregated | Passed ✅ | 120.0 | 153.71541160805808 | 6.384305831136473 | 0.0 | 1911 | 0 | 80.34984599999007 | 1251.9617030000063 |
v1.75.5.rc.1
What's Changed
- Litellm release notes 08 10 2025 by @krrishdholakia in #13479
- Litellm model cost map fixes by @krrishdholakia in #13480
- ui - build updated ui + increase max_tokens in health_check = 10 (gpt-5-nano throws error for max_token=1) by @krrishdholakia in #13482
Full Changelog: v1.75.5-stable.rc-draft...v1.75.5.rc.1
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.75.5.rc.1
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 140.0 | 177.87900723772194 | 6.396151577527124 | 0.0 | 1914 | 0 | 110.98393499997883 | 1386.0049980000042 |
Aggregated | Passed ✅ | 140.0 | 177.87900723772194 | 6.396151577527124 | 0.0 | 1914 | 0 | 110.98393499997883 | 1386.0049980000042 |