10 Jul 12:29

KennethEnevoldsen

cfa27d7

1.38.34

1.38.34 (2025-07-10)

Fix

fix: pin datasets version (#2892)

fix datasets version (00c95cf)

Unknown

Update tasks & benchmarks tables (5303fec)
dataset: Evalita dataset integration (#2859)
Added DadoEvalCoarseClassification
Removed unnecessary columns from DadoEvalCoarseClassification
Added EmitClassification task
added SardiStanceClassification task
Added GeoLingItClassification task
Added DisCoTexPairClassification tasks
Added EmitClassification, DadoEvalCoarseClassification, GeoLingItClassification, SardiStanceClassification inside the inits
changed import in DisCoTexPairClassification
removed GeoLingItClassification dataset
fixed citation formatting, missing metadata parameters and lint formatting
Added XGlueWRPReranking task

Added missing __init__.py files

fixed metadata in XGlueWRPReranking
Added MKQARetrieval task
fixed type in XGlueWRPReranking
changed MKQARetrieval from cross-lingual to monolingual
formatted MKQARetrieval file
removed unused const

Co-authored-by: Mattia Sangermano (ee17a6e)

model: add Hakim and TookaSBERTV2 models (#2826)
add tooka v2s
add mcinext models
update mcinext.py
Apply PR review suggestions
Update mteb/models/mcinext_models.py

Co-authored-by: mehran
Co-authored-by: Kenneth Enevoldsen (04dc6d4)

Update tasks & benchmarks tables (5be02c1)
Add and fix some Japanese datasets: ANLP datasets, JaCWIR, JQaRA (#2872)
Add JaCWIR and JQaRA for reranking
Fix ANLP Journal datasets
Add NLPJournalAbsArticleRetrieval and JaCWIRRetrieval
tackle test cases
Remove _evaluate_subset usage
Separate v1 and v2
Update info for NLP Journal datasets (70768b5)
Comment kalm model (#2877)

comment kalm model (a3ca95c)

model: add kalm_models ModelMeta (new PR) (#2853)
feat: add KaLM_Embedding_X_0605 in kalm_models
Update kalm_models.py for lint format

Co-authored-by: xinshuohu (b67bd04)

model: add listconranker modelmeta (#2874)
add listconranker modelmeta
fix bugs
use linter
lint

Co-authored-by: Roman Solomatin (5846f56)

fix tests to be compatible with SentenceTransformers v5 (#2875)
fix sbert v5
add comment (f346a37)
rename seed-1.6-embedding to seed1.6-embedding (#2870) (f27648b)
model: Adding nvidia/llama-nemoretriever-colembed models (#2861)
nvidia_llama_nemoretriever_colembed
correct 3b reference
lint fix
add training data and license for nvidia/llama_nemoretriever_colembed
lint

Co-authored-by: Isaac Chung (4ff1413)

Bump gradio to fix leaderboard sorting (#2866)

Bump gradio (a4388c2)

model: Adding Sailesh97/Hinvec (#2842)
Adding Hinvec Model's Meta data.
Adding hinvec_model.py
Update mteb/models/hinvec_models.py

Co-authored-by: Kenneth Enevoldsen

formatted code with Black and lint with Ruff

Co-authored-by: Kenneth Enevoldsen (e3286d5)

27 Jun 21:09

KennethEnevoldsen

1.38.33

9fed3e5

1.38.33 (2025-06-27)

Fix

fix: prompt validation for tasks with - (#2846)
fix prompt validation
fix task name split correctly
add docstring for test (430357c)
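
The fix above concerns prompt keys in which a task name and a prompt type are joined by a hyphen. A minimal sketch of how such a key can be split when the task name itself contains a hyphen, assuming keys of the form `<task name>-<prompt type>` and the prompt types `query` and `passage`. The helper name `split_prompt_key` is hypothetical, and this is not necessarily how mteb implements the check:

```python
from __future__ import annotations

# Assumed prompt-type suffixes; illustrative only.
PROMPT_TYPES = {"query", "passage"}


def split_prompt_key(key: str) -> tuple[str, str | None]:
    """Split "<task name>-<prompt type>" into its two parts.

    Only the text after the LAST hyphen is treated as a possible prompt
    type, so task names that contain hyphens are left intact.
    """
    head, sep, tail = key.rpartition("-")
    if sep and tail in PROMPT_TYPES:
        return head, tail
    # No recognised suffix: the whole key is the task name.
    return key, None
```

Splitting on the last hyphen only, and accepting the suffix only when it is a known prompt type, avoids misreading part of a hyphenated task name as a prompt type.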

Unknown

add jinav4 model meta (#2858)
add model meta
linting
fix: add check for code lora
fix: apply review comments (f1d560a)

25 Jun 22:48

KennethEnevoldsen

1.38.32

82844eb

1.38.32 (2025-06-25)

Fix

fix: update training dataset info of Seed-1.6-embedding model (#2857)

update seed1.6 model training data info (a8214e2)

25 Jun 12:46

KennethEnevoldsen

1.38.31

642898f

1.38.31 (2025-06-25)

Documentation

docs: Fix some typos in docs/usage/usage.md (#2835)
Update usage.md
Update usage.md
Update docs/usage/usage.md

Co-authored-by: Isaac Chung (774a942)

Fix

fix: Update model selection for the leaderboard (#2855)
fix: Update model selection for the leaderboard

fixes #2834

This removed the lower bound selection, but generally I don't think people should care about the models being too small.

fix 1M --> 1B
format
rename model_size -> max_model_size (9a800d3)

Unknown

model: add Seed-1.6-embedding model (#2841)
add Seed-1.6-embedding model
Update seed_1_6_embedding_models.py
update model meta info
support image encoder interface
error fix
fix: format seed_1_6_embedding_models.py with Ruff (8851bf0)
model: Add custom instructions for GigaEmbeddings (#2836)
add custom instructions
fixed
lint
fix last instruction

Co-authored-by: Kolodin Egor
Co-authored-by: Roman Solomatin (d7ff1ab)

16 Jun 08:33

KennethEnevoldsen

1.38.30

dcdc16a

1.38.30 (2025-06-16)

Fix

fix: Reuploaded previously unavailable SNL datasets (#2819)
fix: Reuploaded previously unavailable SNL datasets

closes #2477

removed exceptions from tests
temp fixes
added temporary fix
clean up commented out code
format (c790269)

Unknown

Update tasks & benchmarks tables (74d17b2)
model: Added 3 HIT-TMG's KaLM-embedding models (#2478)
Added HIT-TMG_KaLM-embedding-multilingual-mini-instruct-v1 with instruct wrapper
Added KaLM_embedding_multilingual_mini_instruct_v1_5
Added model to overview.py
Fix Task Count Per Language Table in tasks.md
resolve conflicts
remove tasks.md
Modified get_instruction function
Added support for prompt dict in get_instruction
fix lang code
Address comments
Delete mteb/models/check_models.py
added prompts_dict support in InstructSentenceTransformerWrapper
corrected instruction format
corrected prompts format
added correct instruction format
fix implementation
remove if name main
add comment

Co-authored-by: Roman Solomatin (03e084b)

add description to issue template (#2817)
add description to template
fix typo (04c9511)

11 Jun 19:20

KennethEnevoldsen

1.38.29

9fc0c3d

1.38.29 (2025-06-11)

Fix

fix: Ensure bright uses the correct revision (#2812)

fixes #2811 (56dc620)

fix: Adding client arg to init method of OpenAI models wrapper (#2803)
Adding OpenAI client arg to init method (e.g., for already initialized AzureOpenAI client)

To use OpenAI embedding models via Azure, the model wrapper needs to be initialized with a different client.

Update mteb/models/openai_models.py

Co-authored-by: Roman Solomatin

Update mteb/models/openai_models.py
remove comment and format

Co-authored-by: Kenneth Enevoldsen
Co-authored-by: Roman Solomatin (873ee76)
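
The client argument described in this entry follows a common dependency-injection pattern: the wrapper uses a client passed by the caller and only builds a default one when none is given. A minimal sketch of that pattern, assuming a wrapper with a `client` keyword. `EmbeddingWrapperSketch` and `StubClient` are hypothetical names and do not reproduce the code in mteb/models/openai_models.py:

```python
from __future__ import annotations

from types import SimpleNamespace
from typing import Any


class EmbeddingWrapperSketch:
    """Hypothetical stand-in for a model wrapper that accepts a client."""

    def __init__(self, model_name: str, client: Any | None = None) -> None:
        self.model_name = model_name
        # Use the injected client when given; otherwise build a default one.
        self._client = client if client is not None else self._default_client()

    @staticmethod
    def _default_client() -> Any:
        # Imported lazily so the sketch runs without the openai package
        # whenever a client is injected.
        from openai import OpenAI

        return OpenAI()

    def embed(self, texts: list[str]) -> list[list[float]]:
        response = self._client.embeddings.create(input=texts, model=self.model_name)
        return [item.embedding for item in response.data]


class _StubEmbeddings:
    """Mimics the shape of an embeddings response without any network call."""

    def create(self, input: list[str], model: str) -> SimpleNamespace:
        data = [SimpleNamespace(embedding=[float(len(text))]) for text in input]
        return SimpleNamespace(data=data)


class StubClient:
    embeddings = _StubEmbeddings()


wrapper = EmbeddingWrapperSketch("text-embedding-3-small", client=StubClient())
vectors = wrapper.embed(["ab", "abcd"])
```

In practice an already initialized `AzureOpenAI` client from the openai package would take the place of `StubClient`, which is the use case the entry mentions.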

Unknown

model: Add annamodels/LGAI-Embedding-Preview (#2810)

Add LGAI-Embedding

Add mteb/models/lgai_embedding_models.py
defined model metadata (3e291f3)

10 Jun 21:44

KennethEnevoldsen

1.38.28

e6238f2

1.38.28 (2025-06-10)

Ci

ci: fix config error for semantic release (#2800)

discussed in: #2796 (3d8dd9e)

Fix

fix: Add adapted_from to Cmedqaretrieval (#2806)
fix: Add adapted_from to Cmedqaretrieval

Also snuck in a fix with form=None, which is no longer valid, but was still used in a few places.

format (fef1837)

Unknown

Update training datasets of GeoGPT-Research-Project/GeoEmbedding (#2802)

update training datasets

Co-authored-by: zhangzeqing (36a3c67)

Update tasks & benchmarks tables (5e6aa9d)
dataset: Add R2MED Benchmark (#2795)
Add files via upload
Add files via upload
Update benchmarks.py
Update __init__.py
Add files via upload
Update R2MEDRetrieval.py
Update run_mteb_r2med.py
Delete scripts/run_mteb_r2med.py
Update mteb/tasks/Retrieval/eng/R2MEDRetrieval.py

Co-authored-by: Roman Solomatin

Update mteb/tasks/Retrieval/eng/R2MEDRetrieval.py

Co-authored-by: Roman Solomatin

Update mteb/tasks/Retrieval/eng/R2MEDRetrieval.py

Co-authored-by: Roman Solomatin

Update mteb/tasks/Retrieval/eng/R2MEDRetrieval.py

Co-authored-by: Roman Solomatin

Add files via upload
Delete mteb/descriptive_stats/Retrieval/R2MEDRetrieval.json
Add files via upload
Add files via upload
Add files via upload
Update R2MEDRetrieval.py
Add files via upload
Add files via upload
Add files via upload
Add files via upload
format citations
Update R2MEDRetrieval.py
Add files via upload
Add files via upload

Co-authored-by: Li Lei
Co-authored-by: Roman Solomatin (b8e64e1)

model: add fangxq/XYZ-embedding (#2741)
add xyz model
add xyz model
add xyz model
update
update
update
update
update
update
update
lint

Co-authored-by: Roman Solomatin
Co-authored-by: Kenneth Enevoldsen (1c08974)

model: Add GeoGPT-Research-Project/GeoEmbedding (#2773)
add model: geogpt_models
update geogpt_models
use InstructSentenceTransformerWrapper
resolve pylint warning
format geogpt_models.py
Update mteb/models/geogpt_models.py

Co-authored-by: Roman Solomatin

Update mteb/models/geogpt_models.py

Co-authored-by: zhangzeqing
Co-authored-by: Roman Solomatin
Co-authored-by: Kenneth Enevoldsen (8817670)

Update issue and pr templates (#2782)
Update issue templates
Update bug_report.md
test yaml template
add templates
update templates
add emojis
fix typo
Apply suggestions from code review

Co-authored-by: Kenneth Enevoldsen

update issue titles
update PR template
remove PR templates

Co-authored-by: Kenneth Enevoldsen (af7adbf)

bump ruff (#2784) (9e2e972)
model: Add Qwen3 Embedding model (#2769)
Init code
Remove extra config and lint code
use sentence transformer
add revisions
fix lint
Apply suggestions from code review

Co-authored-by: Roman Solomatin

fix lint
add framework

Co-authored-by: Roman Solomatin (fe137d0)

Update tasks & benchmarks tables (360bf51)
dataset: Add miracl vision (#2736)
add miracl vision
add miracl vision
ruff
cast
image
image
add langs
add langs
add langs
add langs
descriptive stats
lint
lint
lint
remove com (61dc369)

05 Jun 16:38

KennethEnevoldsen

1.38.27

793b888

1.38.27 (2025-06-05)

Fix

fix: CachedEmbeddingWrapper issues in both documentation and code (#2779)

Fixes #2772 (f7656d5)

05 Jun 16:09

KennethEnevoldsen

1.38.26

513f730

1.38.26 (2025-06-05)

Fix

fix: Update Caltech101 datasets to latest revision [v1] (#2778)
fix: Update Caltech101 datasets to latest revision [v2]

fixes: #2770
Fixes the issue, but only in v1

# tested using:

import mteb

task: mteb.AbsTask = mteb.get_task("Caltech101ZeroShot")
task.load_data()
task.get_candidate_labels()

fix rev (40f0841)

05 Jun 15:30

KennethEnevoldsen

1.38.25

cd0b490

1.38.25 (2025-06-05)

Ci

ci: add new prefixes to releases (#2766)

add new prefixes (755a6eb)

Fix

fix: Update giga embeddings (#2774)
update giga embeddings
update giga embeddings

Co-authored-by: Kolodin Egor (5b71e34)

Releases: embeddings-benchmark/mteb
