Conversation

fangxiaoquan
Contributor

@fangxiaoquan fangxiaoquan commented May 30, 2025

Checklist

  • I did not add a dataset, or if I did, I added the dataset checklist to the PR and completed it.
  • I did not add a model, or if I did, I added the model checklist to the PR and completed it.

Adding a model checklist

  • I have filled out the ModelMeta object to the extent possible
  • I have ensured that my model can be loaded using
    • mteb.get_model(model_name, revision) and
    • mteb.get_model_meta(model_name, revision)
  • I have tested the implementation works on a representative set of tasks.
  • The model is public, i.e., it is available either as an API or the weights are publicly available to download
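
The two loading checks in the list above can be exercised with a short script. What follows is a minimal sketch and not part of this PR: `check_model_loads` is a hypothetical helper name, and the two loaders are passed in as arguments so the sketch assumes nothing about `mteb` beyond the call shapes quoted in the checklist.

```python
from typing import Any, Callable, Optional


def check_model_loads(
    model_name: str,
    revision: Optional[str],
    get_model: Callable[..., Any],
    get_model_meta: Callable[..., Any],
) -> bool:
    """Return True if both loaders succeed for the given name and revision.

    `get_model` and `get_model_meta` are meant to be `mteb.get_model` and
    `mteb.get_model_meta`; they are injected so this hypothetical helper has
    no hard dependency on mteb. An exception from either loader is reported
    and counted as a failed check.
    """
    try:
        meta = get_model_meta(model_name, revision)
        model = get_model(model_name, revision)
    except Exception as err:  # report and fail instead of crashing the check
        print(f"Loading {model_name!r} failed: {err}")
        return False
    # A loader that returns None is treated as a failure as well.
    return meta is not None and model is not None


# Intended usage (requires mteb to be installed; the revision is a placeholder):
#   import mteb
#   check_model_loads("fangxq/XYZ-embedding", "<revision>",
#                     mteb.get_model, mteb.get_model_meta)
```

Running the check with a pinned revision rather than a default is probably the more useful variant, since the checklist asks for both the name and the revision to resolve.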

@fangxiaoquan fangxiaoquan requested a review from Samoed June 4, 2025 06:06
@Samoed Samoed requested a review from KennethEnevoldsen June 5, 2025 08:46
@KennethEnevoldsen KennethEnevoldsen changed the title from "add xyz model" to "fix: add new model: fangxq/XYZ-embedding" Jun 5, 2025
@KennethEnevoldsen KennethEnevoldsen added the "new model" label Jun 5, 2025
@KennethEnevoldsen
Contributor

All good. I will just need you to complete the model checklist

@KennethEnevoldsen
Contributor

@fangxiaoquan, you will have to add the checklist to the PR comment

@Samoed Samoed changed the title from "fix: add new model: fangxq/XYZ-embedding" to "model: fangxq/XYZ-embedding" Jun 5, 2025
@fangxiaoquan
Contributor Author

@fangxiaoquan, you will have to add the checklist to the PR comment

Adding a model checklist

  • I have filled out the ModelMeta object to the extent possible
  • I have ensured that my model can be loaded using
    • mteb.get_model(model_name, revision) and
    • mteb.get_model_meta(model_name, revision)
  • I have tested the implementation works on a representative set of tasks.
  • The model is public, i.e., it is available either as an API or the weights are publicly available to download

@KennethEnevoldsen KennethEnevoldsen changed the title from "model: fangxq/XYZ-embedding" to "model: add fangxq/XYZ-embedding" Jun 9, 2025
@KennethEnevoldsen KennethEnevoldsen enabled auto-merge (squash) June 9, 2025 07:32
@KennethEnevoldsen KennethEnevoldsen merged commit 1c08974 into embeddings-benchmark:main Jun 9, 2025
10 checks passed
isaac-chung added a commit that referenced this pull request Jun 22, 2025
* move icon & name to benchmark dataclass (#2573)

* Remove the comments from ImageEncoder (#2579)

* fix: Add Encodechka benchmark (#2561)

* add tasks

* add benchmark

* fix imports

* update stsb split

* Update tasks table

* 1.38.2

Automatically generated by python-semantic-release

* fix FlagEmbedding package name (#2588)

* fix codecarbon version (#2587)

* Add MIEB image only benchmark (#2590)

* add vision only bench

* add description

* correct zs task modalities

* specify tasks param

* Add image only MIEB benchmark to LB left panel (#2596)

* Update benchmarks.py

* make lint

* add to left side bar

* update Doubao-1.5-Embedding (#2575)

* update seed-embedding

* update seed models

* fix linting and tiktoken problem

* fix tiktoken bug

* fix lint

* update name

* Update mteb/models/seed_models.py

adopt suggestion

Co-authored-by: Roman Solomatin <[email protected]>

* update logging

* update lint

---------

Co-authored-by: zhangpeitian <[email protected]>
Co-authored-by: Roman Solomatin <[email protected]>

* fix: Add WebSSL models (#2604)

* add 2 web SSL dino models

* add models from collection and revisions

* update memory_usage_mb and embed dim

* use automodel instead

* fix mieb citation (#2606)

* 1.38.3

Automatically generated by python-semantic-release

* Update Doubao-1.5-Embedding (#2611)

* update seed-embedding

* update seed models

* fix linting and tiktoken problem

* fix tiktoken bug

* fix lint

* update name

* Update mteb/models/seed_models.py

adopt suggestion

Co-authored-by: Roman Solomatin <[email protected]>

* update logging

* update lint

* update link

---------

Co-authored-by: zhangpeitian <[email protected]>
Co-authored-by: Roman Solomatin <[email protected]>

* CI: update benchmark table (#2609)

* update benchmark table

* fix table

* Update tasks & benchmarks tables

* Update Doubao-1.5-Embedding revision (#2613)

* update seed-embedding

* update seed models

* fix linting and tiktoken problem

* fix tiktoken bug

* fix lint

* update name

* Update mteb/models/seed_models.py

adopt suggestion

Co-authored-by: Roman Solomatin <[email protected]>

* update logging

* update lint

* update link

* update revision

---------

Co-authored-by: zhangpeitian <[email protected]>
Co-authored-by: Roman Solomatin <[email protected]>

* Update tasks & benchmarks tables

* CI: fix table  (#2615)

* Update tasks & benchmarks tables

* Update gradio version (#2558)

* Update gradio version

Closes #2557

* bump gradio

* fix: Removed missing dataset for MTEB(Multilingual) and bumped version

We should probably have done this earlier to ensure that the multilingual benchmark is runnable.

* CI: fix infinitely committing issue (#2616)

* fix token

* try to trigger

* add token

* test ci

* Update tasks & benchmarks tables

* Update tasks & benchmarks tables

* remove test lines

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

* Add ScandiSent dataset (#2620)

* add scandisent dataset

* add to init

* typo

* lint

* 1.38.4

Automatically generated by python-semantic-release

* Format all citations (#2614)

* Fix errors in bibtex_citation

* Format all bibtex_citation fields

* format benchmarks

* fix format

* Fix tests

* add formatting script

* fix citations (#2628)

* Add Talemaader pair classification task (#2621)

Add talemaader pair classification task

* add Bilingual English-Danish parallel corpus from The Danish Medicines Agency (#2633)

* add Bilingual English-Danish parallel corpus from The Danish Medicines Agency

* bump dataset revision

* format bibtex

* format bibtex

* Remove irrelevant test (#2630)

remove irrelevant test

* Revert "CI: fix infinitely committing issue (#2616)" (#2636)

This reverts commit 82dcb3d.

* Update tasks & benchmarks tables

* Remove `typer` dependency from citation script (#2629)

remove typer dependency from citation script

* CI format citations (#2649)

* ci format citations

* add files

* remove from lint CI

* test lint

* test lint

* fix names

* fix: Update VisualSTS Aggregate task modalities (#2597)

* Update STS17MultilingualVisualSTS.py

* fix STSBenchmarkMultilingualVisualSTS

---------

Co-authored-by: Isaac Chung <[email protected]>

* 1.38.5

Automatically generated by python-semantic-release

* Add tests for leaderboard build (#2631)

* Add tests for leaderboard build

* add new action

* remove build tests from other actions

* fix tests

* correct exclusion of test

* added timeout constant

* fix: SIB200 machine translated > human translated (#2665)

As correctly pointed out in:

https://huggingface.co/datasets/mteb/sib200/discussions/1

* 1.38.6

Automatically generated by python-semantic-release

* fix: Update datasets which can't be loaded with `datasets>=3.0` (#2661)

fix: Update datasets which can't be loaded with `datasets>=3.0` (#1619)

* reupload datasets

* fix loader

* remove commented code

* lint

* update pyproject dependencies

* rename model RELLE to CHAIN19 (#2671)

* Add relle
* defined model metadata for relle

* Add mteb/models/relle_models.py

* Update mteb/models/relle_models.py

Co-authored-by: Roman Solomatin <[email protected]>

* lint after commit

run after "make lint"

* Add into model_modules

Add model into model_modules and lint check

* rename model
change model name

* rename model
change model name

---------

Co-authored-by: Roman Solomatin <[email protected]>

* 1.38.7

Automatically generated by python-semantic-release

* Update final version of Doubao-1.5-Embedding (Rename to Seed1.5-Embedding) (#2674)

* update seed-embedding

* update seed models

* fix linting and tiktoken problem

* fix tiktoken bug

* fix lint

* update name

* Update mteb/models/seed_models.py

adopt suggestion

Co-authored-by: Roman Solomatin <[email protected]>

* update logging

* update lint

* update link

* update revision

* update Doubao-1.5-Embedding revision 3

* rename Doubao-1.5-Embedding to Seed1.5-Embedding

---------

Co-authored-by: zhangpeitian <[email protected]>
Co-authored-by: Roman Solomatin <[email protected]>

* fix: Allow empty string for openai models (#2676)

* fix for empty string input to openai/text-embedding-3-large

* fix: Allow empty string in openai models

closes: #1650

* fix based on review

* Updated docstring

---------

Co-authored-by: ayush1298 <[email protected]>

* 1.38.8

Automatically generated by python-semantic-release

* Leaderboard: UI simplifications for menus (#2672)

* Leaderboard: UI simplifications for menus

Did a few things to simplify the leaderboard UI.

Changes:
- Combined FAQ entries
- Created dropdowns in the select benchmark menu sidebar
- Removed reference to arena
- Removed reference to old leaderboard
- reduced size of select menu
- reduced the size of acknowledgements
- removed farsi from the selection (as it is a beta)

refactors:
- refactored to use a class for menu items
- refactored texts segments out of app.py

* fixed comment

* fixes for sizes

* fix modality for `OVENIT2TRetrieval` (#2678)

fix modality

* fix: `MTEB(Code, v1)`  languages (#2679)

fix code languages

* 1.38.9

Automatically generated by python-semantic-release

* Correction in docs (#2688)

* Fix for Openai_Text-Embedding3-Small (#2702)

* Fix for Openai_Text-Embedding3-Small

* better syntax for readability

* fix: Ensure that optional dependencies are compatible and if not state it (#2706)

Fixes mistakes introduced in #2424

It seems like many of these requirements don't exist (voyageai>=1.0.0). @ayush1298 I am hoping you could clear up how this happened?

* fix: Only install mteb into site packages (#2618)

* Restrict installation directory

* fix

* namespace false

* add star

* add pont

* fix import

* fix import

* add init files

* fix setuptools find

* fix image init

* add missing templates

---------

Co-authored-by: Roman Solomatin <[email protected]>

* 1.38.10

Automatically generated by python-semantic-release

* docs: Updated the PR template and improved submission docs (#2704)

* docs: Updated the PR template and improved submission docs

1) Updated PR template to only include checklist for datasets and models. The other checklists were essentially just tests.
2) I have updated the documentation for adding models. Notably I have split out the implementation segment, which I think makes it more readable.
3) Required that you argue for a dataset before addition

fixes #2568

* Apply suggestions from code review

Co-authored-by: Isaac Chung <[email protected]>

---------

Co-authored-by: Isaac Chung <[email protected]>

* fix: Remove models from the leaderboard (#2705)

* fix: Remove models from the leaderboard

I remove both models from the leaderboard by unlinking them from the import tree. I think this is the easiest way to add a model that is not currently public.

* format

* 1.38.11

Automatically generated by python-semantic-release

* fix: Rename gemini-embedding-exp-03-07 to gemini-embedding-001 (#2711)

* Rename gemini-embedding-exp-03-07 to gemini-embedding-001

* update reference link to the vertexAI API doc

* 1.38.12

Automatically generated by python-semantic-release

* fix: Integrate `lightonai/GTE-ModernColBERT-v1` (#2708)

* fix: Integrate `lightonai/GTE-ModernColBERT-v1`

Fixes #2673

* fixes based on corrections

* 1.38.13

Automatically generated by python-semantic-release

* docs: fix number of tasks for eng, v2 in docs (#2720)

* fix: Added potion-multilingual-128M (#2717)

* Added ModelMeta for potion-multilingual-128M

* Fixed linting

* Fixed linting

* Updated date

* 1.38.14

Automatically generated by python-semantic-release

* Update the max tokens for gemini-embedding-001 (#2725)

* fix: Ara and ben classification dataset cleaning (#2632)

* Improve classification datasets quality for ara and ben langs

* add missing AJGT

* fix format

* change ajgt description

* Fix numbers in description, add link to pull request

* Add too short filter

* Link in markdown format

* Update tasks & benchmarks tables

* fix: Update Seed1.5-Embedding API (#2724)

* update seed1.5-embedding api

* update seed1.5-embedding api

* update Seed1.5-Embedding API

* update Seed1.5-Embedding resolve comments

* update Seed1.5-Embedding lint

* Update mteb/models/seed_models.py

---------

Co-authored-by: Kenneth Enevoldsen <[email protected]>

* 1.38.15

Automatically generated by python-semantic-release

* fix: Add vidore v2 benchmarks (#2713)

* adding vidore benchmarks

* fix typo

* clean vidore names + per lang eval

* lint

* vidore names

* bibtex fix

* fix revision

* vidore v2 citation

* update citation format and fix per-language mappings

* lint: citations

* typo citations

* Update tasks & benchmarks tables

* 1.38.16

Automatically generated by python-semantic-release

* fix: `IndicQARetrieval` loader (#2729)

* fix indic qa

* add kwargs

* 1.38.17

Automatically generated by python-semantic-release

* fix: Promote Persian benchmark to v1 (#2707)

* Switch versioning from beta to v1 and add v1 to benchmark selector

* Update Farsi benchmark display name, task IDs, and metadata

* Add Hakim Model

* fix hakim version

* update

* make lint

* fix: Promote Persian benchmark to v1

---------

Co-authored-by: mehran <[email protected]>
Co-authored-by: Kenneth Enevoldsen <[email protected]>

* Update tasks & benchmarks tables

* 1.38.18

Automatically generated by python-semantic-release

* Add ViDoRe combined benchmark and add to leaderboard side panel (#2732)

* add ViDoRe combined benchmark and add to leaderboard side panel

* Update benchmark_selector.py

* Update tasks & benchmarks tables

* fix: Rename display name of VDR (#2734)

* Update tasks & benchmarks tables

* 1.38.19

Automatically generated by python-semantic-release

* fix: Add colpali models family (#2721)

* add colpali models

* add colpali as framework

* add colpali as framework

* update metadata and add colsmol

* fix typos

* account for revision

* add training data info and lint

* modify meta

* correct colmodels meta and add colnomic 7b

* fix typo in toml (colpali subdeps)

* refine colmodel loading and metadata

* 1.38.20

Automatically generated by python-semantic-release

* fix: Correct embedding dimension for bge-m3 (#2738)

Fixes #2735

* 1.38.21

Automatically generated by python-semantic-release

* docs: Updated description of FEVER (#2745)

* docs: Updated description of FEVER

Update the description to state that the corpus is the same as FEVER, as we have had [multiple questions on it](https://huggingface.co/datasets/mteb/climate-fever/discussions/2)

* minor

* Backfill task metadata for BigPatentClustering and AllegroReviews (#2755)

* big-patent

* allegro-reviews

* Update tasks & benchmarks tables

* Update Seed1.5 training data (#2749)

* update seed1.5 training data

* update seed1.5 training data

* fix: Update caltech101 (#2759)

* docs: Updated description of FEVER

Update the description to state that the corpus is the same as FEVER, as we have had [multiple questions on it](https://huggingface.co/datasets/mteb/climate-fever/discussions/2)

* fix: Update Caltech101 to different source

Ran both versions of one of the tasks using `nomic-ai/nomic-embed-text-v1.5`; the scores match to three decimal places:

### Old

```
{
  "dataset_revision": "851374102055782c84f89b1b4e9d128a6568847b",
  "task_name": "Caltech101",
  "mteb_version": "1.38.4",
  "scores": {
    "test": [
      {
        "accuracy": 0.897863,
```

### New
```
{
  "dataset_revision": "52439cf6d4f6ebf563d8cdc7f2c5371d9efd2686",
  "task_name": "Caltech101",
  "mteb_version": "1.38.4",
  "scores": {
    "test": [
      {
        "accuracy": 0.897929,
```
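
The comparison of the two accuracies above can be made explicit with a tolerance check. This is a small sketch only: `scores_match` is a hypothetical helper, and the absolute tolerance of `1e-3` is an assumption chosen for illustration, not a threshold used by the project.

```python
import math


def scores_match(old: float, new: float, abs_tol: float = 1e-3) -> bool:
    """Return True if two scores differ by at most `abs_tol` (assumed tolerance)."""
    return math.isclose(old, new, rel_tol=0.0, abs_tol=abs_tol)


old_accuracy = 0.897863  # "Old" result, dataset revision 8513741...
new_accuracy = 0.897929  # "New" result, dataset revision 52439cf...

# Absolute gap between the two runs of the Caltech101 task.
difference = abs(new_accuracy - old_accuracy)
print(f"absolute difference: {difference:.6f}")
print("match within tolerance:", scores_match(old_accuracy, new_accuracy))
```

The gap is roughly 6.6e-5, which is well inside the assumed tolerance.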

* 1.38.22

Automatically generated by python-semantic-release

* Add missing PatchCamelyon_labels.txt (#2756)

* ci: Delete cache in Model loading test only when model is loaded (#2761)

* only delete cache when model loaded

* testing it out

* fix: Add `cadet-embed-base-v1` (#2727)

* update

* update overview.py for models

* update

* update

* 1.38.23

Automatically generated by python-semantic-release

* Fixing Google embedding task type for STS (#2767)

The type `SIMILARITY` is invalid. Correct one: `SEMANTIC_SIMILARITY`. See https://cloud.google.com/vertex-ai/generative-ai/docs/embeddings/task-types#supported_task_types
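
A mistake of this kind can be caught before the request is sent by validating the task type locally. The sketch below is an illustration only: `validate_task_type` is a hypothetical helper, and the set of names is my reading of the documentation page linked above, so it is an assumption that may be incomplete or out of date.

```python
# Task type names as read from the Vertex AI documentation linked above.
# Assumption of this sketch: the set may be incomplete or outdated.
SUPPORTED_TASK_TYPES = frozenset(
    {
        "RETRIEVAL_QUERY",
        "RETRIEVAL_DOCUMENT",
        "SEMANTIC_SIMILARITY",
        "CLASSIFICATION",
        "CLUSTERING",
        "QUESTION_ANSWERING",
        "FACT_VERIFICATION",
        "CODE_RETRIEVAL_QUERY",
    }
)


def validate_task_type(task_type: str) -> str:
    """Return `task_type` unchanged, or raise ValueError if it is not supported."""
    if task_type not in SUPPORTED_TASK_TYPES:
        raise ValueError(
            f"Unsupported task type {task_type!r}; "
            f"expected one of {sorted(SUPPORTED_TASK_TYPES)}"
        )
    return task_type
```

With this check, `validate_task_type("SIMILARITY")` raises a `ValueError`, while `validate_task_type("SEMANTIC_SIMILARITY")` passes the value through.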

* docs: Leaderboard simplifications (#2764)

* docs: Leaderboard simplifications

Simplified sidebar, notably:

1) Combined Language and Regional (since these are all languages)
2) Folded all sections (with visual document retrieval added, the images start to take up a lot of space)
3) Removed legacy and instead added "Other" in language, where I moved "English Legacy"

I also restructured the code so that nesting is easier.

Is it also possible to create a separate section (see dummy screenshot)?

* refactor to reduce nesting

* format

* fix: add xet support (#2603)

* add xet version

* add doc comment

* change xet requirements

* Update docs/usage/usage.md

---------

Co-authored-by: Kenneth Enevoldsen <[email protected]>

* 1.38.24

Automatically generated by python-semantic-release

* fix: Update giga embeddings (#2774)

* update giga embeddings

* update giga embeddings

---------

Co-authored-by: Kolodin Egor <[email protected]>

* ci: add new prefixes to releases (#2766)

add new prefixes

* 1.38.25

Automatically generated by python-semantic-release

* fix: Update Caltech101 datasets to latest revision [v1] (#2778)

* fix: Update Caltech101 datasets to latest revision [v2]

 fixes: #2770
Fixes the issue, but only in v1

```
# tested using:
import mteb

task: mteb.AbsTask = mteb.get_task("Caltech101ZeroShot")
task.load_data()
task.get_candidate_labels()
```

* fix rev

* 1.38.26

Automatically generated by python-semantic-release

* fix: CachedEmbeddingWrapper issues in both documentation and code (#2779)

Fixes #2772

* 1.38.27

Automatically generated by python-semantic-release

* dataset: Add miracl vision (#2736)

* add miracl vision

* add miracl vision

* ruff

* cast

* image

* image

* add langs

* add langs

* add langs

* add langs

* descriptive stats

* lint

* lint

* lint

* remove com

* Update tasks & benchmarks tables

* model: Add Qwen3 Embedding model (#2769)

* Init code

* Remove extra config and lint code

* use sentence transformer

* add revisions

* fix lint

* Apply suggestions from code review

Co-authored-by: Roman Solomatin <[email protected]>

* fix lint

* add framework

---------

Co-authored-by: Roman Solomatin <[email protected]>

* bump ruff (#2784)

* Update issue and pr templates (#2782)

* Update issue templates

* Update bug_report.md

* test yaml template

* add templates

* update templates

* add emojis

* fix typo

* Apply suggestions from code review

Co-authored-by: Kenneth Enevoldsen <[email protected]>

* update issue titles

* update PR template

* remove PR templates

---------

Co-authored-by: Kenneth Enevoldsen <[email protected]>

* model: Add GeoGPT-Research-Project/GeoEmbedding (#2773)

* add model: geogpt_models

* update geogpt_models

* use InstructSentenceTransformerWrapper

* resolve pylint warning

* format geogpt_models.py

* Update mteb/models/geogpt_models.py

Co-authored-by: Roman Solomatin <[email protected]>

* Update mteb/models/geogpt_models.py

---------

Co-authored-by: zhangzeqing <[email protected]>
Co-authored-by: Roman Solomatin <[email protected]>
Co-authored-by: Kenneth Enevoldsen <[email protected]>

* model: add fangxq/XYZ-embedding (#2741)

* add xyz model

* add xyz model

* add xyz model

* update

* update

* update

* update

* update

* update

* update

* lint

---------

Co-authored-by: Roman Solomatin <[email protected]>
Co-authored-by: Kenneth Enevoldsen <[email protected]>

* ci: fix config error for semantic release (#2800)

discussed in: #2796

* dataset: Add R2MED Benchmark (#2795)

* Add files via upload

* Add files via upload

* Update benchmarks.py

* Update __init__.py

* Add files via upload

* Update R2MEDRetrieval.py

* Update run_mteb_r2med.py

* Delete scripts/run_mteb_r2med.py

* Update mteb/tasks/Retrieval/eng/R2MEDRetrieval.py

Co-authored-by: Roman Solomatin <[email protected]>

* Update mteb/tasks/Retrieval/eng/R2MEDRetrieval.py

Co-authored-by: Roman Solomatin <[email protected]>

* Update mteb/tasks/Retrieval/eng/R2MEDRetrieval.py

Co-authored-by: Roman Solomatin <[email protected]>

* Update mteb/tasks/Retrieval/eng/R2MEDRetrieval.py

Co-authored-by: Roman Solomatin <[email protected]>

* Add files via upload

* Delete mteb/descriptive_stats/Retrieval/R2MEDRetrieval.json

* Add files via upload

* Add files via upload

* Add files via upload

* Update R2MEDRetrieval.py

* Add files via upload

* Add files via upload

* Add files via upload

* Add files via upload

* format citations

* Update R2MEDRetrieval.py

* Add files via upload

* Add files via upload

---------

Co-authored-by: Li Lei <[email protected]>
Co-authored-by: Roman Solomatin <[email protected]>

* Update tasks & benchmarks tables

* Update training datasets of GeoGPT-Research-Project/GeoEmbedding (#2802)

update training datasets

Co-authored-by: zhangzeqing <[email protected]>

* fix: Add adapted_from to Cmedqaretrieval (#2806)

* fix: Add adapted_from to Cmedqaretrieval

Also snuck in a fix with form=None, which is no longer valid, but was still used in a few places.

* format

* 1.38.28

Automatically generated by python-semantic-release

* fix: Adding client arg to init method of OpenAI models wrapper (#2803)

* Adding OpenAI client arg to init method (e.g., for already initialized AzureOpenAI client)

To use OpenAI embedding models via Azure, the model wrapper needs to be initialized with a different client.
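
The general pattern described here, accepting an optional pre-built client, might look roughly like the sketch below. It is not the code in `mteb/models/openai_models.py`: the class name `EmbeddingWrapper`, its `client` parameter, and its `encode` method are hypothetical, and the only library calls assumed are `openai.OpenAI()` and `client.embeddings.create(...)`.

```python
from typing import Any, List, Optional, Sequence


class EmbeddingWrapper:
    """Hypothetical wrapper illustrating client injection (not mteb's class)."""

    def __init__(self, model_name: str, client: Optional[Any] = None) -> None:
        self.model_name = model_name
        if client is None:
            # Default path: build the standard client lazily, so the openai
            # package is only needed when no client is injected.
            from openai import OpenAI

            client = OpenAI()
        # Injected path: any object exposing `embeddings.create(...)` works,
        # for example an already configured AzureOpenAI instance.
        self._client = client

    def encode(self, sentences: Sequence[str]) -> List[List[float]]:
        """Embed the sentences and return one vector per input, in order."""
        response = self._client.embeddings.create(
            input=list(sentences), model=self.model_name
        )
        return [item.embedding for item in response.data]
```

Passing the client in, rather than constructing it inside the wrapper, keeps endpoint and credential handling with the caller and makes the wrapper easy to test with a stub.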

* Update mteb/models/openai_models.py

Co-authored-by: Roman Solomatin <[email protected]>

* Update mteb/models/openai_models.py

* remove comment and format

---------

Co-authored-by: Kenneth Enevoldsen <[email protected]>
Co-authored-by: Roman Solomatin <[email protected]>

* model: Add annamodels/LGAI-Embedding-Preview (#2810)

Add LGAI-Embedding

- Add mteb/models/lgai_embedding_models.py

- defined model metadata

* fix: Ensure bright uses the correct revision (#2812)

fixes #2811

* 1.38.29

Automatically generated by python-semantic-release

* add description to issue template (#2817)

* add description to template

* fix typo

* model: Added 3 HIT-TMG's KaLM-embedding models (#2478)

* Added HIT-TMG_KaLM-embedding-multilingual-mini-instruct-v1 with instruct wrapper

* Added KaLM_embedding_multilingual_mini_instruct_v1_5

* Added model to overview.py

* Fix Task Count Per Language Table in tasks.md

* resolve conflicts

* remove tasks.md

* Modified get_instruction function

* Added support for prompt dict in get_instruction

* fix lang code

* Address comments

* Delete mteb/models/check_models.py

* added prompts_dict support in InstructSentenceTransformerWrapper

* corrected instruction format

* corrected prompts format

* added correct instruction format

* fix implementation

* remove `if name main`

* add comment

---------

Co-authored-by: Roman Solomatin <[email protected]>

* fix: Reuploaded previously unavailable SNL datasets (#2819)

* fix: Reuploaded previously unavailable SNL datasets

closes #2477

* removed exceptions from tests

* temp fixes

* added temporary fix

* clean up commented out code

* format

* Update tasks & benchmarks tables

* 1.38.30

Automatically generated by python-semantic-release

* docs: Fix some typos in `docs/usage/usage.md` (#2835)

* Update usage.md

* Update usage.md

* Update docs/usage/usage.md

---------

Co-authored-by: Isaac Chung <[email protected]>

* model: Add custom instructions for GigaEmbeddings (#2836)

* add custom instructions

* fixed

* lint

* fix last instruction

---------

Co-authored-by: Kolodin Egor <[email protected]>
Co-authored-by: Roman Solomatin <[email protected]>

* try adding init

* add init in audio pc task eng

* all audio tasks init

* remove script test

---------

Co-authored-by: Roman Solomatin <[email protected]>
Co-authored-by: Kenneth Enevoldsen <[email protected]>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions <[email protected]>
Co-authored-by: namespace-Pt <[email protected]>
Co-authored-by: zhangpeitian <[email protected]>
Co-authored-by: Alexey Vatolin <[email protected]>
Co-authored-by: Imene Kerboua <[email protected]>
Co-authored-by: Ömer Veysel Çağatan <[email protected]>
Co-authored-by: Munot Ayush Sunil <[email protected]>
Co-authored-by: 24September <[email protected]>
Co-authored-by: wang.yuqi <[email protected]>
Co-authored-by: Roman Solomatin <[email protected]>
Co-authored-by: Feiyang <[email protected]>
Co-authored-by: Thomas van Dongen <[email protected]>
Co-authored-by: Paul Teiletche <[email protected]>
Co-authored-by: Mehran Sarmadi <[email protected]>
Co-authored-by: mehran <[email protected]>
Co-authored-by: Dawid Koterwas <[email protected]>
Co-authored-by: Wentao Wu <[email protected]>
Co-authored-by: Manveer Tamber <[email protected]>
Co-authored-by: malteos <[email protected]>
Co-authored-by: Egor <[email protected]>
Co-authored-by: Kolodin Egor <[email protected]>
Co-authored-by: Manuel Faysse <[email protected]>
Co-authored-by: Xin Zhang <[email protected]>
Co-authored-by: Hypothesis-Z <[email protected]>
Co-authored-by: zhangzeqing <[email protected]>
Co-authored-by: fangxiaoquan <[email protected]>
Co-authored-by: Li Lei <[email protected]>
Co-authored-by: annamodels <[email protected]>
Co-authored-by: Sadra Barikbin <[email protected]>
isaac-chung added a commit that referenced this pull request Jul 6, 2025
* Update tasks & benchmarks tables

* Update Doubao-1.5-Embedding revision (#2613)

* update seed-embedding

* update seed models

* fix linting and tiktoken problem

* fix tiktoken bug

* fix lint

* update name

* Update mteb/models/seed_models.py

adopt suggestion

Co-authored-by: Roman Solomatin <[email protected]>

* update logging

* update lint

* update link

* update revision

---------

Co-authored-by: zhangpeitian <[email protected]>
Co-authored-by: Roman Solomatin <[email protected]>

* Update tasks & benchmarks tables

* CI: fix table  (#2615)

* Update tasks & benchmarks tables

* Update gradio version (#2558)

* Update gradio version

Closes #2557

* bump gradio

* fix: Removed missing dataset for MTEB(Multilingual) and bumped version

We should probably have done this earlier to ensure that the multilingual benchmark is runnable.

* CI: fix infinitely committing issue (#2616)

* fix token

* try to trigger

* add token

* test ci

* Update tasks & benchmarks tables

* Update tasks & benchmarks tables

* remove test lines

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

* Add ScandiSent dataset (#2620)

* add scandisent dataset

* add to init

* typo

* lint

* 1.38.4

Automatically generated by python-semantic-release

* Format all citations (#2614)

* Fix errors in bibtex_citation

* Format all bibtex_citation fields

* format benchmarks

* fix format

* Fix tests

* add formatting script

* fix citations (#2628)

* Add Talemaader pair classification task (#2621)

Add talemaader pair classification task

* add Bilingual English-Danish parallel corpus from The Danish Medicines Agency (#2633)

* add Bilingual English-Danish parallel corpus from The Danish Medicines Agency

* bump dataset revision

* format bibtex

* format bibtex

* Remove irrelevant test (#2630)

remove irrelevant test

* Revert "CI: fix infinitely committing issue (#2616)" (#2636)

This reverts commit 82dcb3d.

* Update tasks & benchmarks tables

* Remove `typer` dependency from citation script (#2629)

remove typer dependency from citation script

* CI format citations (#2649)

* ci format citations

* add files

* remove from lint CI

* test lint

* test lint

* fix names

* fix: Update VisualSTS Aggregate task modalities (#2597)

* Update STS17MultilingualVisualSTS.py

* fix STSBenchmarkMultilingualVisualSTS

---------

Co-authored-by: Isaac Chung <[email protected]>

* 1.38.5

Automatically generated by python-semantic-release

* Add tests for leaderboard build (#2631)

* Add tests for leaderboard build

* add new action

* remove build tests from other actions

* fix tests

* correct exclusion of test

* added timeout constant

* fix: SIB200 machine translated > human translated (#2665)

As correctly pointed out in:

https://huggingface.co/datasets/mteb/sib200/discussions/1

* 1.38.6

Automatically generated by python-semantic-release

* fix: Update datasets which can't be loaded with `datasets>=3.0`  (#2661)

fix: Update datasets which can't be loaded with `datasets>=3.0` (#1619)

* reupload datasets

* fix loader

* remove commented code

* lint

* update pyproject dependencies

* rename model RELLE to CHAIN19 (#2671)

* Add relle
* defined model metadata for relle

* Add mteb/models/relle_models.py

* Update mteb/models/relle_models.py

Co-authored-by: Roman Solomatin <[email protected]>

* lint after commit

run after "make lint"

* Add into model_modules

Add model into model_modules and lint check

* rename model
change model name

* rename model
change model name

---------

Co-authored-by: Roman Solomatin <[email protected]>

* 1.38.7

Automatically generated by python-semantic-release

* Update final version of Doubao-1.5-Embedding (Rename to Seed1.5-Embedding) (#2674)

* update seed-embedding

* update seed models

* fix linting and tiktoken problem

* fix tiktoken bug

* fix lint

* update name

* Update mteb/models/seed_models.py

adopt suggestion

Co-authored-by: Roman Solomatin <[email protected]>

* update logging

* update lint

* update link

* update revision

* update Doubao-1.5-Embedding revision 3

* rename Doubao-1.5-Embedding to Seed1.5-Embedding

---------

Co-authored-by: zhangpeitian <[email protected]>
Co-authored-by: Roman Solomatin <[email protected]>

* fix: Allow empty string for openai models (#2676)

* fix for empty string input to openai/text-embedding-3-large

* fix: Allow empty string in openai models

closes: #1650

* fix based on review

* Updated docstring

---------

Co-authored-by: ayush1298 <[email protected]>
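
A minimal sketch of one way empty strings could be handled before a batch is sent to an embedding API, assuming the API rejects empty inputs. The function name `replace_empty_inputs` and the single-space placeholder are assumptions for illustration; this is not necessarily how the fix in #2676 is implemented.

```python
from typing import List


def replace_empty_inputs(texts: List[str], placeholder: str = " ") -> List[str]:
    """Return a copy of `texts` where empty or whitespace-only entries
    are swapped for `placeholder` (hypothetical helper, not mteb API)."""
    return [text if text.strip() else placeholder for text in texts]


# Order and length are preserved, so embeddings still line up with inputs.
batch = ["first sentence", "", "second sentence"]
cleaned = replace_empty_inputs(batch)
print(cleaned)
```

Keeping the list length unchanged matters because downstream evaluation pairs each embedding with its original input by position.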

* 1.38.8

Automatically generated by python-semantic-release

* Leaderboard: UI simplifications for menus (#2672)

* Leaderboard: UI simplifications for menus

Did a few things to simplify the leaderboard UI.

Changes:
- Combined FAQ entries
- Created dropdowns in the select benchmark menu sidebar
- Removed reference to arena
- Removed reference to old leaderboard
- reduced size of select menu
- reduced the size of acknowledgements
- removed farsi from the selection (as it is a beta)

refactors:
- refactored to use a class for menu items
- refactored texts segments out of app.py

* fixed comment

* fixes for sizes

* fix modality for `OVENIT2TRetrieval` (#2678)

fix modality

* fix: `MTEB(Code, v1)`  languages (#2679)

fix code languages

* 1.38.9

Automatically generated by python-semantic-release

* Correction in docs (#2688)

* Fix for Openai_Text-Embedding3-Small (#2702)

* Fix for Openai_Text-Embedding3-Small

* better syntax for readability

* fix: Ensure that optional dependencies are compatible and if not state it (#2706)

Fixes mistakes introduced in #2424

It seems that many of these requirements don't exist (e.g. voyageai>=1.0.0). @ayush1298, I am hoping you could clear up how this happened?

* fix: Only install mteb into site packages (#2618)

* Restrict installation directory

* fix

* namespace false

* add star

* add pont

* fix import

* fix import

* add init files

* fix setuptools find

* fix image init

* add missing templates

---------

Co-authored-by: Roman Solomatin <[email protected]>

* 1.38.10

Automatically generated by python-semantic-release

* docs: Updated the PR template and improved submission docs (#2704)

* docs: Updated the PR template and improved submission docs

1) Updated PR template to only include checklist for datasets and models. The other checklists were essentially just tests.
2) I have updated the documentation for adding models. Notably I have split out the implementation segment, which I think makes it more readable.
3) Required that you argue for a dataset before addition

fixes #2568

* Apply suggestions from code review

Co-authored-by: Isaac Chung <[email protected]>

---------

Co-authored-by: Isaac Chung <[email protected]>

* fix: Remove models from the leaderboard (#2705)

* fix: Remove models from the leaderboard

I removed both models from the leaderboard by unlinking them from the import tree. I think this is the easiest way to add a model that is not currently public.

* format

* 1.38.11

Automatically generated by python-semantic-release

* fix: Rename gemini-embedding-exp-03-07 to gemini-embedding-001 (#2711)

* Rename gemini-embedding-exp-03-07 to gemini-embedding-001

* update reference link to the vertexAI API doc

* 1.38.12

Automatically generated by python-semantic-release

* fix: Integrate `lightonai/GTE-ModernColBERT-v1` (#2708)

* fix: Integrate `lightonai/GTE-ModernColBERT-v1`

Fixes #2673

* fixes based on corrections

* 1.38.13

Automatically generated by python-semantic-release

* docs: fix number of tasks for eng, v2 in docs (#2720)

* fix: Added potion-multilingual-128M (#2717)

* Added ModelMeta for potion-multilingual-128M

* Fixed linting

* Fixed linting

* Updated date

* 1.38.14

Automatically generated by python-semantic-release

* Update the max tokens for gemini-embedding-001 (#2725)

* fix: Ara and ben classification dataset cleaning (#2632)

* Improve classification datasets quality for ara and ben langs

* add missing AJGT

* fix format

* change ajgt description

* Fix numbers in description, add link to pull request

* Add too short filter

* Link in markdown format

* Update tasks & benchmarks tables

* fix: Update Seed1.5-Embedding API (#2724)

* update seed1.5-embedding api

* update seed1.5-embedding api

* update Seed1.5-Embedding API

* update Seed1.5-Embedding resolve comments

* update Seed1.5-Embedding lint

* Update mteb/models/seed_models.py

---------

Co-authored-by: Kenneth Enevoldsen <[email protected]>

* 1.38.15

Automatically generated by python-semantic-release

* fix: Add vidore v2 benchmarks (#2713)

* adding vidore benchmarks

* fix typo

* clean vidore names + per lang eval

* lint

* vidore names

* bibtex fix

* fix revision

* vidore v2 citation

* update citation format and fix per-language mappings

* lint: citations

* typo citations

* Update tasks & benchmarks tables

* 1.38.16

Automatically generated by python-semantic-release

* fix: `IndicQARetrieval` loader (#2729)

* fix indic qa

* add kwargs

* 1.38.17

Automatically generated by python-semantic-release

* fix: Promote Persian benchmark to v1 (#2707)

* Switch versioning from beta to v1 and add v1 to benchmark selector

* Update Farsi benchmark display name, task IDs, and metadata

* Add Hakim Model

* fix hakim version

* update

* make lint

* fix: Promote Persian benchmark to v1

---------

Co-authored-by: mehran <[email protected]>
Co-authored-by: Kenneth Enevoldsen <[email protected]>

* Update tasks & benchmarks tables

* 1.38.18

Automatically generated by python-semantic-release

* Add ViDoRe combined benchmark and add to leaderboard side panel (#2732)

* add ViDoRe combined benchmark and add to leaderboard side panel

* Update benchmark_selector.py

* Update tasks & benchmarks tables

* fix: Rename display name of VDR (#2734)

* Update tasks & benchmarks tables

* 1.38.19

Automatically generated by python-semantic-release

* fix: Add colpali models family (#2721)

* add colpali models

* add colpali as framework

* add colpali as framework

* update metadata and add colsmol

* fix typos

* account for revision

* add training data info and lint

* modify meta

* correct colmodels meta and add colnomic 7b

* fix typo in toml (colpali subdeps)

* refine colmodel loading and metadata

* 1.38.20

Automatically generated by python-semantic-release

* fix: Correct embedding dimension for bge-m3 (#2738)

Fixes #2735

* 1.38.21

Automatically generated by python-semantic-release

* docs: Updated description of FEVER (#2745)

* docs: Updated description of FEVER

Update the description to state that the corpus is the same as fever, as we have had [multiple questions on it](https://huggingface.co/datasets/mteb/climate-fever/discussions/2)

* minor

* Backfill task metadata for BigPatentClustering and AllegroReviews (#2755)

* big-patent

* allegro-reviews

* Update tasks & benchmarks tables

* Update Seed1.5 training data (#2749)

* update seed1.5 training data

* update seed1.5 training data

* fix: Update caltech101 (#2759)

* docs: Updated description of FEVER

Update the description to state that the corpus is the same as fever, as we have had [multiple questions on it](https://huggingface.co/datasets/mteb/climate-fever/discussions/2)

* fix: Update Caltech101 to different source

Ran both versions of one of the tasks using `nomic-ai/nomic-embed-text-v1.5`, and the scores match closely:

### Old

```
{
  "dataset_revision": "851374102055782c84f89b1b4e9d128a6568847b",
  "task_name": "Caltech101",
  "mteb_version": "1.38.4",
  "scores": {
    "test": [
      {
        "accuracy": 0.897863,
```

### New
```
{
  "dataset_revision": "52439cf6d4f6ebf563d8cdc7f2c5371d9efd2686",
  "task_name": "Caltech101",
  "mteb_version": "1.38.4",
  "scores": {
    "test": [
      {
        "accuracy": 0.897929,
```
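
As a quick check of how close the two runs are, the accuracies from the snippets above can be compared directly. This is only an illustrative sketch: the tolerance value is an assumption chosen for the example, not a threshold used by the project.

```python
import math

# Accuracies copied from the "Old" and "New" result snippets above.
old_accuracy = 0.897863
new_accuracy = 0.897929

abs_diff = abs(new_accuracy - old_accuracy)
rel_diff = abs_diff / old_accuracy

# Assumed tolerance, picked for illustration only.
TOLERANCE = 1e-3
scores_agree = math.isclose(old_accuracy, new_accuracy, abs_tol=TOLERANCE)

print(f"absolute difference: {abs_diff:.6f}")
print(f"relative difference: {rel_diff:.4%}")
print(f"within tolerance: {scores_agree}")
```

The absolute difference is about 6.6e-5, so the two dataset sources give the same accuracy to three decimal places.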

* 1.38.22

Automatically generated by python-semantic-release

* Add missing PatchCamelyon_labels.txt (#2756)

* ci: Delete cache in Model loading test only when model is loaded (#2761)

* only delete cache when model loaded

* testing it out

* fix: Add `cadet-embed-base-v1` (#2727)

* update

* update overview.py for models

* update

* update

* 1.38.23

Automatically generated by python-semantic-release

* Fixing Google embedding task type for STS (#2767)

The type `SIMILARITY` is invalid. Correct one: `SEMANTIC_SIMILARITY`. See https://cloud.google.com/vertex-ai/generative-ai/docs/embeddings/task-types#supported_task_types
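
A small sketch of the kind of validation that would catch this class of error early. Only `SEMANTIC_SIMILARITY` is confirmed by the description above; the other task-type strings are my reading of the linked Vertex AI page and should be checked against it. The mapping and the function name are hypothetical and are not taken from the mteb code.

```python
# Task-type strings as I understand the linked Vertex AI documentation
# (assumption: verify against the docs before relying on this set).
VALID_VERTEX_TASK_TYPES = {
    "RETRIEVAL_QUERY",
    "RETRIEVAL_DOCUMENT",
    "SEMANTIC_SIMILARITY",
    "CLASSIFICATION",
    "CLUSTERING",
}

# Hypothetical mapping from benchmark task categories to API task types.
TASK_CATEGORY_TO_VERTEX = {
    "STS": "SEMANTIC_SIMILARITY",  # previously the invalid "SIMILARITY"
    "Classification": "CLASSIFICATION",
    "Clustering": "CLUSTERING",
}


def vertex_task_type(category: str) -> str:
    """Look up the API task type and fail loudly if it is not a known value."""
    task_type = TASK_CATEGORY_TO_VERTEX[category]
    if task_type not in VALID_VERTEX_TASK_TYPES:
        raise ValueError(f"Invalid task type {task_type!r} for {category!r}")
    return task_type


print(vertex_task_type("STS"))
```

Validating the mapping at import or test time means a typo such as `SIMILARITY` surfaces as a local error rather than as a failed API call.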

* docs: Leaderboard simplifications (#2764)

* docs: Leaderboard simplifications

Simplified sidebar, notably:

1) Combined Language and Regional (since these are all languages)
2) Folded all sections (with visual document retrieval, the images start to take up a lot of space)
3) Removed legacy and instead added "Other" in language, where I moved "English Legacy"

I also restructured the code so that nesting is easier.

Is it also possible to create a separate section? (See dummy screenshot.)

* refactor to reduce nesting

* format

* fix: add xet support (#2603)

* add xet version

* add doc comment

* change xet requirements

* Update docs/usage/usage.md

---------

Co-authored-by: Kenneth Enevoldsen <[email protected]>

* 1.38.24

Automatically generated by python-semantic-release

* fix: Update giga embeddings (#2774)

* update giga embeddings

* update giga embeddings

---------

Co-authored-by: Kolodin Egor <[email protected]>

* ci: add new prefixes to releases (#2766)

add new prefixes

* 1.38.25

Automatically generated by python-semantic-release

* fix: Update Caltech101 datasets to latest revision [v1] (#2778)

* fix: Update Caltech101 datasets to latest revision [v2]

fixes: #2770
Fixes the issue, but only in v1

```
# tested using:
import mteb

task: mteb.AbsTask = mteb.get_task("Caltech101ZeroShot")
task.load_data()
task.get_candidate_labels()
```

* fix rev

* 1.38.26

Automatically generated by python-semantic-release

* fix: CachedEmbeddingWrapper issues in both documentation and code (#2779)

Fixes #2772

* 1.38.27

Automatically generated by python-semantic-release

* dataset: Add miracl vision (#2736)

* add miracl vision

* add miracl vision

* ruff

* cast

* image

* image

* add langs

* add langs

* add langs

* add langs

* descriptive stats

* lint

* lint

* lint

* remove com

* Update tasks & benchmarks tables

* model: Add Qwen3 Embedding model (#2769)

* Init code

* Remove extra config and lint code

* use sentence transformer

* add revisions

* fix lint

* Apply suggestions from code review

Co-authored-by: Roman Solomatin <[email protected]>

* fix lint

* add framework

---------

Co-authored-by: Roman Solomatin <[email protected]>

* bump ruff (#2784)

* Update issue and pr templates (#2782)

* Update issue templates

* Update bug_report.md

* test yaml template

* add templates

* update templates

* add emojis

* fix typo

* Apply suggestions from code review

Co-authored-by: Kenneth Enevoldsen <[email protected]>

* update issue titles

* update PR template

* remove PR templates

---------

Co-authored-by: Kenneth Enevoldsen <[email protected]>

* model: Add GeoGPT-Research-Project/GeoEmbedding (#2773)

* add model: geogpt_models

* update geogpt_models

* use InstructSentenceTransformerWrapper

* resolve pylint warning

* format geogpt_models.py

* Update mteb/models/geogpt_models.py

Co-authored-by: Roman Solomatin <[email protected]>

* Update mteb/models/geogpt_models.py

---------

Co-authored-by: zhangzeqing <[email protected]>
Co-authored-by: Roman Solomatin <[email protected]>
Co-authored-by: Kenneth Enevoldsen <[email protected]>

* model: add fangxq/XYZ-embedding (#2741)

* add xyz model

* add xyz model

* add xyz model

* update

* update

* update

* update

* update

* update

* update

* lint

---------

Co-authored-by: Roman Solomatin <[email protected]>
Co-authored-by: Kenneth Enevoldsen <[email protected]>

* ci: fix config error for semantic release (#2800)

discussed in: #2796

* dataset: Add R2MED Benchmark (#2795)

* Add files via upload

* Add files via upload

* Update benchmarks.py

* Update __init__.py

* Add files via upload

* Update R2MEDRetrieval.py

* Update run_mteb_r2med.py

* Delete scripts/run_mteb_r2med.py

* Update mteb/tasks/Retrieval/eng/R2MEDRetrieval.py

Co-authored-by: Roman Solomatin <[email protected]>

* Update mteb/tasks/Retrieval/eng/R2MEDRetrieval.py

Co-authored-by: Roman Solomatin <[email protected]>

* Update mteb/tasks/Retrieval/eng/R2MEDRetrieval.py

Co-authored-by: Roman Solomatin <[email protected]>

* Update mteb/tasks/Retrieval/eng/R2MEDRetrieval.py

Co-authored-by: Roman Solomatin <[email protected]>

* Add files via upload

* Delete mteb/descriptive_stats/Retrieval/R2MEDRetrieval.json

* Add files via upload

* Add files via upload

* Add files via upload

* Update R2MEDRetrieval.py

* Add files via upload

* Add files via upload

* Add files via upload

* Add files via upload

* format citations

* Update R2MEDRetrieval.py

* Add files via upload

* Add files via upload

---------

Co-authored-by: Li Lei <[email protected]>
Co-authored-by: Roman Solomatin <[email protected]>

* Update tasks & benchmarks tables

* Update training datasets of GeoGPT-Research-Project/GeoEmbedding (#2802)

update training datasets

Co-authored-by: zhangzeqing <[email protected]>

* fix: Add adapted_from to Cmedqaretrieval (#2806)

* fix: Add adapted_from to Cmedqaretrieval

Also snuck in a fix with form=None, which is no longer valid, but was still used in a few places.

* format

* 1.38.28

Automatically generated by python-semantic-release

* fix: Adding client arg to init method of OpenAI models wrapper (#2803)

* Adding OpenAI client arg to init method (e.g., for already initialized AzureOpenAI client)

To use OpenAI embedding models via Azure, the model wrapper needs to be initialized with a different client.
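
The idea of passing in an already-initialized client can be sketched generically as below. `StubClient` and `EmbeddingModelWrapper` are hypothetical names used so the example runs without any API dependency; they are not the classes in `mteb/models/openai_models.py`, and the real signature may differ.

```python
from typing import Any, Optional


class StubClient:
    """Stand-in for an already-initialized API client (hypothetical)."""

    def __init__(self, backend: str) -> None:
        self.backend = backend


class EmbeddingModelWrapper:
    """Hypothetical wrapper that accepts an optional, pre-built client."""

    def __init__(self, model_name: str, client: Optional[Any] = None) -> None:
        self.model_name = model_name
        # Use the injected client if given; otherwise build a default one.
        self._client = client if client is not None else StubClient("default")

    @property
    def backend(self) -> str:
        return self._client.backend


default_wrapper = EmbeddingModelWrapper("text-embedding-3-small")
azure_wrapper = EmbeddingModelWrapper(
    "text-embedding-3-small", client=StubClient("azure")
)
print(default_wrapper.backend, azure_wrapper.backend)
```

Accepting the client as an argument keeps endpoint and credential handling outside the wrapper, which is what makes an alternative backend such as Azure usable without changing the wrapper itself.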

* Update mteb/models/openai_models.py

Co-authored-by: Roman Solomatin <[email protected]>

* Update mteb/models/openai_models.py

* remove comment and format

---------

Co-authored-by: Kenneth Enevoldsen <[email protected]>
Co-authored-by: Roman Solomatin <[email protected]>

* model: Add annamodels/LGAI-Embedding-Preview (#2810)

Add LGAI-Embedding

- Add mteb/models/lgai_embedding_models.py

- defined model metadata

* fix: Ensure bright uses the correct revision (#2812)

fixes #2811

* 1.38.29

Automatically generated by python-semantic-release

* add description to issue template (#2817)

* add description to template

* fix typo

* model: Added 3 HIT-TMG's KaLM-embedding models (#2478)

* Added HIT-TMG_KaLM-embedding-multilingual-mini-instruct-v1 with instruct wrapper

* Added KaLM_embedding_multilingual_mini_instruct_v1_5

* Added model to overview.py

* Fix Task Count Per Language Table in tasks.md

* resolve conflicts

* remove tasks.md

* Modified get_instruction function

* Added support for prompt dict in get_instruction

* fix lang code

* Address comments

* Delete mteb/models/check_models.py

* added prompts_dict support in InstructSentenceTransformerWrapper

* corrected instruction format

* corrected prompts format

* added correct instruction format

* fix implementation

* remove `if name main`

* add comment

---------

Co-authored-by: Roman Solomatin <[email protected]>

* fix: Reuploaded previously unavailable SNL datasets (#2819)

* fix: Reuploaded previously unavailable SNL datasets

closes #2477

* removed exceptions from tests

* temp fixes

* added temporary fix

* clean up commented out code

* format

* Update tasks & benchmarks tables

* 1.38.30

Automatically generated by python-semantic-release

* docs: Fix some typos in `docs/usage/usage.md` (#2835)

* Update usage.md

* Update usage.md

* Update docs/usage/usage.md

---------

Co-authored-by: Isaac Chung <[email protected]>

* model: Add custom instructions for GigaEmbeddings (#2836)

* add custom instructions

* fixed

* lint

* fix last instruction

---------

Co-authored-by: Kolodin Egor <[email protected]>
Co-authored-by: Roman Solomatin <[email protected]>

* model: add Seed-1.6-embedding model (#2841)

* add Seed-1.6-embedding model

* Update seed_1_6_embedding_models.py

* update model meta info

* support image encoder interface

* error fix

* fix: format seed_1_6_embedding_models.py with Ruff

* fix: Update model selection for the leaderboard (#2855)

* fix: Update model selection for the leaderboard

fixes #2834

This removed the lower bound selection, but generally I don't think people should care about the models being too small.

* fix 1M --> 1B

* format

* rename model_size -> max_model_size

* 1.38.31

Automatically generated by python-semantic-release

* fix: update training dataset info of Seed-1.6-embedding model  (#2857)

update seed1.6 model training data info

* 1.38.32

Automatically generated by python-semantic-release

* add jinav4 model meta (#2858)

* add model meta

* linting

* fix: add check for code lora

* fix: apply review comments

* fix: prompt validation for tasks with `-` (#2846)

* fix prompt validation

* fix task name split correctly

* add docstring for test
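
To illustrate why task names containing `-` are awkward for prompt validation, here is a sketch that assumes prompt keys have the form `<task name>` or `<task name>-<prompt type>`. The function name, the set of prompt types, and the task names are assumptions for the example; this is not the code from #2846.

```python
from typing import Optional, Set, Tuple

# Assumed prompt-type suffixes, for illustration only.
VALID_PROMPT_TYPES = {"query", "passage"}


def split_prompt_key(key: str, known_tasks: Set[str]) -> Tuple[str, Optional[str]]:
    """Split a prompt key into (task name, prompt type or None)."""
    # A key that is itself a task name has no prompt-type suffix,
    # even when the task name contains a hyphen.
    if key in known_tasks:
        return key, None
    # Otherwise split on the LAST hyphen only, so hyphens inside
    # the task name are left untouched.
    name, sep, suffix = key.rpartition("-")
    if sep and suffix in VALID_PROMPT_TYPES and name in known_tasks:
        return name, suffix
    raise ValueError(f"Invalid prompt key: {key!r}")


tasks = {"NFCorpus-PL", "STS12"}
print(split_prompt_key("NFCorpus-PL", tasks))
print(split_prompt_key("NFCorpus-PL-query", tasks))
```

A naive `key.split("-")` would break `NFCorpus-PL-query` into three parts; checking for a full task-name match first and then splitting from the right avoids that.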

* 1.38.33

Automatically generated by python-semantic-release

* model: Adding Sailesh97/Hinvec (#2842)

* Adding Hinvec Model's Meta data.

* Adding hinvec_model.py

* Update mteb/models/hinvec_models.py

Co-authored-by: Kenneth Enevoldsen <[email protected]>

* formatted code with Black and linted with Ruff

---------

Co-authored-by: Kenneth Enevoldsen <[email protected]>

* Bump gradio to fix leaderboard sorting (#2866)

Bump gradio

* model: Adding nvidia/llama-nemoretriever-colembed models (#2861)

* nvidia_llama_nemoretriever_colembed

* correct 3b reference

* lint fix

* add training data and license for nvidia/llama_nemoretriever_colembed

* lint

---------

Co-authored-by: Isaac Chung <[email protected]>

* rename seed-1.6-embedding to seed1.6-embedding (#2870)

* fix tests to be compatible with `SentenceTransformers` `v5` (#2875)

* fix sbert `v5`

* add comment

* model: add listconranker modelmeta (#2874)

* add listconranker modelmeta

* fix bugs

* use linter

* lint

---------

Co-authored-by: Roman Solomatin <[email protected]>

* model: add kalm_models ModelMeta (new PR) (#2853)

* feat: add KaLM_Embedding_X_0605 in kalm_models

* Update kalm_models.py for lint format

---------

Co-authored-by: xinshuohu <[email protected]>

* Comment kalm model (#2877)

comment kalm model

* Add and fix some Japanese datasets: ANLP datasets, JaCWIR, JQaRA (#2872)

* Add JaCWIR and JQaRA for reranking

* Fix ANLP Journal datasets

* Add NLPJournalAbsArticleRetrieval and JaCWIRRetrieval

* tackle test cases

* Remove _evaluate_subset usage

* Separate v1 and v2

* Update info for NLP Journal datasets

* Update tasks & benchmarks tables

* model: add Hakim and TookaSBERTV2 models (#2826)

* add tooka v2s

* add mcinext models

* update mcinext.py

* Apply PR review suggestions

* Update mteb/models/mcinext_models.py

---------

Co-authored-by: mehran <[email protected]>
Co-authored-by: Kenneth Enevoldsen <[email protected]>

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: namespace-Pt <[email protected]>
Co-authored-by: zhangpeitian <[email protected]>
Co-authored-by: Roman Solomatin <[email protected]>
Co-authored-by: Kenneth Enevoldsen <[email protected]>
Co-authored-by: github-actions <[email protected]>
Co-authored-by: Alexey Vatolin <[email protected]>
Co-authored-by: Imene Kerboua <[email protected]>
Co-authored-by: Ömer Veysel Çağatan <[email protected]>
Co-authored-by: Munot Ayush Sunil <[email protected]>
Co-authored-by: 24September <[email protected]>
Co-authored-by: wang.yuqi <[email protected]>
Co-authored-by: Roman Solomatin <[email protected]>
Co-authored-by: Feiyang <[email protected]>
Co-authored-by: Thomas van Dongen <[email protected]>
Co-authored-by: Paul Teiletche <[email protected]>
Co-authored-by: Mehran Sarmadi <[email protected]>
Co-authored-by: mehran <[email protected]>
Co-authored-by: Dawid Koterwas <[email protected]>
Co-authored-by: Wentao Wu <[email protected]>
Co-authored-by: Manveer Tamber <[email protected]>
Co-authored-by: malteos <[email protected]>
Co-authored-by: Egor <[email protected]>
Co-authored-by: Kolodin Egor <[email protected]>
Co-authored-by: Manuel Faysse <[email protected]>
Co-authored-by: Xin Zhang <[email protected]>
Co-authored-by: Hypothesis-Z <[email protected]>
Co-authored-by: zhangzeqing <[email protected]>
Co-authored-by: fangxiaoquan <[email protected]>
Co-authored-by: Li Lei <[email protected]>
Co-authored-by: annamodels <[email protected]>
Co-authored-by: Sadra Barikbin <[email protected]>
Co-authored-by: Quan Yuhan <[email protected]>
Co-authored-by: Quan Yuhan <[email protected]>
Co-authored-by: Mohammad Kalim Akram <[email protected]>
Co-authored-by: Sailesh Panda <[email protected]>
Co-authored-by: bschifferer <[email protected]>
Co-authored-by: tutuDoki <[email protected]>
Co-authored-by: Xinshuo Hu <[email protected]>
Co-authored-by: xinshuohu <[email protected]>
Co-authored-by: lsz05 <[email protected]>
Co-authored-by: Kenneth Enevoldsen <[email protected]>
Samoed added a commit that referenced this pull request Jul 10, 2025
* Update tasks & benchmarks tables

* CI: fix table  (#2615)

* Update tasks & benchmarks tables

* Update gradio version (#2558)

* Update gradio version

Closes #2557

* bump gradio

* fix: Removed missing dataset for MTEB(Multilingual) and bumped version

We should probably have done this earlier to ensure that the multilingual benchmark is runnable.

* CI: fix infinitely committing issue (#2616)

* fix token

* try to trigger

* add token

* test ci

* Update tasks & benchmarks tables

* Update tasks & benchmarks tables

* remove test lines

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

* Add ScandiSent dataset (#2620)

* add scandisent dataset

* add to init

* typo

* lint

* 1.38.4

Automatically generated by python-semantic-release

* Format all citations (#2614)

* Fix errors in bibtex_citation

* Format all bibtex_citation fields

* format benchmarks

* fix format

* Fix tests

* add formatting script

* fix citations (#2628)

* Add Talemaader pair classification task (#2621)

Add talemaader pair classification task

* add Bilingual English-Danish parallel corpus from The Danish Medicines Agency (#2633)

* add Bilingual English-Danish parallel corpus from The Danish Medicines Agency

* bump dataset revision

* format bibtex

* format bibtex

* Remove irrelevant test (#2630)

remove irrelevant test

* Revert "CI: fix infinitely committing issue (#2616)" (#2636)

This reverts commit 82dcb3d.

* Update tasks & benchmarks tables

* Remove `typer` dependency from citation script (#2629)

remove typer dependency from citation script

* CI format citations (#2649)

* ci format citations

* add files

* remove from lint CI

* test lint

* test lint

* fix names

* fix: Update VisualSTS Aggregate task modalities (#2597)

* Update STS17MultilingualVisualSTS.py

* fix STSBenchmarkMultilingualVisualSTS

---------

Co-authored-by: Isaac Chung <[email protected]>

* 1.38.5

Automatically generated by python-semantic-release

* Add tests for leaderboard build (#2631)

* Add tests for leaderboard build

* add new action

* remove build tests from other actions

* fix tests

* correct exclusion of test

* added timeout constant

* fix: SIB200 machine translated > human translated (#2665)

As correctly pointed out in:

https://huggingface.co/datasets/mteb/sib200/discussions/1

* 1.38.6

Automatically generated by python-semantic-release

* fix: Update datasets which can't be loaded with `datasets>=3.0` (#2661)

fix: Update datasets which can't be loaded with `datasets>=3.0` (#1619)

* reupload datasets

* fix loader

* remove commented code

* lint

* update pyproject dependencies

* rename model RELLE to CHAIN19 (#2671)

* Add relle
* defined model metadata for relle

* Add mteb/models/relle_models.py

* Update mteb/models/relle_models.py

Co-authored-by: Roman Solomatin <[email protected]>

* lint after commit

run after "make lint"

* Add into model_modules

Add model into model_modules and lint check

* rename model
change model name

* rename model
change model name

---------

Co-authored-by: Roman Solomatin <[email protected]>

* 1.38.7

Automatically generated by python-semantic-release

* Update final version of Doubao-1.5-Embedding (Rename to Seed1.5-Embedding) (#2674)

* update seed-embedding

* update seed models

* fix linting and tiktoken problem

* fix tiktoken bug

* fix lint

* update name

* Update mteb/models/seed_models.py

adopt suggestion

Co-authored-by: Roman Solomatin <[email protected]>

* update logging

* update lint

* update link

* update revision

* update Doubao-1.5-Embedding revision 3

* rename Doubao-1.5-Embedding to Seed1.5-Embedding

---------

Co-authored-by: zhangpeitian <[email protected]>
Co-authored-by: Roman Solomatin <[email protected]>

* fix: Allow empty string for openai models (#2676)

* fix for empty string input to openai/text-embedding-3-large

* fix: Allow empty string in openai models

closes: #1650

* fix based on review

* Updated docstring

---------

Co-authored-by: ayush1298 <[email protected]>

* 1.38.8

Automatically generated by python-semantic-release

* Leaderboard: UI simplifications for menus (#2672)

* Leaderboard: UI simplifications for menus

Did a few things to simplify the leaderboard UI.

Changes:
- Combined FAQ entries
- Created dropdowns in the select benchmark menu sidebar
- Removed reference to arena
- Removed reference to old leaderboard
- reduced size of select menu
- reduced the size of acknowledgements
- removed farsi from the selection (as it is a beta)

refactors:
- refactored to use a class for menu items
- refactored texts segments out of app.py

* fixed comment

* fixes for sizes

* fix modality for `OVENIT2TRetrieval` (#2678)

fix modality

* fix: `MTEB(Code, v1)`  languages (#2679)

fix code languages

* 1.38.9

Automatically generated by python-semantic-release

* Correction in docs (#2688)

* Fix for Openai_Text-Embedding3-Small (#2702)

* Fix for Openai_Text-Embedding3-Small

* better syntax for readability

* fix: Ensure that optional dependencies are compatible and if not state it (#2706)

Fixes mistakes introduced in #2424

It seems like many of these requirements don't exist (e.g. voyageai>=1.0.0). @ayush1298 I am hoping you could clear up how this happened?

* fix: Only install mteb into site packages (#2618)

* Restrict installation directory

* fix

* namespace false

* add star

* add pont

* fix import

* fix import

* add init files

* fix setuptools find

* fix image init

* add missing templates

---------

Co-authored-by: Roman Solomatin <[email protected]>

* 1.38.10

Automatically generated by python-semantic-release

* docs: Updated the PR template and improved submission docs (#2704)

* docs: Updated the PR template and improved submission docs

1) Updated PR template to only include checklist for datasets and models. The other checklists were essentially just tests.
2) I have updated the documentation for adding models. Notably I have split out the implementation segment, which I think makes it more readable.
3) Required that you argue for a dataset before addition

fixes #2568

* Apply suggestions from code review

Co-authored-by: Isaac Chung <[email protected]>

---------

Co-authored-by: Isaac Chung <[email protected]>

* fix: Remove models from the leaderboard (#2705)

* fix: Remove models from the leaderboard

I removed both models from the leaderboard by unlinking them from the import tree. I think this is the easiest way to add a model that is not currently public.

* format

* 1.38.11

Automatically generated by python-semantic-release

* fix: Rename gemini-embedding-exp-03-07 to gemini-embedding-001 (#2711)

* Rename gemini-embedding-exp-03-07 to gemini-embedding-001

* update reference link to the vertexAI API doc

* 1.38.12

Automatically generated by python-semantic-release

* fix: Integrate `lightonai/GTE-ModernColBERT-v1` (#2708)

* fix: Integrate `lightonai/GTE-ModernColBERT-v1`

Fixes #2673

* fixes based on corrections

* 1.38.13

Automatically generated by python-semantic-release

* docs: fix number of tasks for eng, v2 in docs (#2720)

* fix: Added potion-multilingual-128M (#2717)

* Added ModelMeta for potion-multilingual-128M

* Fixed linting

* Fixed linting

* Updated date

* 1.38.14

Automatically generated by python-semantic-release

* Update the max tokens for gemini-embedding-001 (#2725)

* fix: Ara and ben classification dataset cleaning (#2632)

* Improve classification datasets quality for ara and ben langs

* add missing AJGT

* fix format

* change ajgt description

* Fix numbers in description, add link to pull request

* Add too short filter

* Link in markdown format

* Update tasks & benchmarks tables

* fix: Update Seed1.5-Embedding API (#2724)

* update seed1.5-embedding api

* update seed1.5-embedding api

* update Seed1.5-Embedding API

* update Seed1.5-Embedding resolve comments

* update Seed1.5-Embedding lint

* Update mteb/models/seed_models.py

---------

Co-authored-by: Kenneth Enevoldsen <[email protected]>

* 1.38.15

Automatically generated by python-semantic-release

* fix: Add vidore v2 benchmarks (#2713)

* adding vidore benchmarks

* fix typo

* clean vidore names + per lang eval

* lint

* vidore names

* bibtex fix

* fix revision

* vidore v2 citation

* update citation format and fix per-language mappings

* lint: citations

* typo citations

* Update tasks & benchmarks tables

* 1.38.16

Automatically generated by python-semantic-release

* fix: `IndicQARetrieval` loader (#2729)

* fix indic qa

* add kwargs

* 1.38.17

Automatically generated by python-semantic-release

* fix: Promote Persian benchmark to v1 (#2707)

* Switch versioning from beta to v1 and add v1 to benchmark selector

* Update Farsi benchmark display name, task IDs, and metadata

* Add Hakim Model

* fix hakim version

* update

* make lint

* fix: Promote Persian benchmark to v1

---------

Co-authored-by: mehran <[email protected]>
Co-authored-by: Kenneth Enevoldsen <[email protected]>

* Update tasks & benchmarks tables

* 1.38.18

Automatically generated by python-semantic-release

* Add ViDoRe combined benchmark and add to leaderboard side panel (#2732)

* add ViDoRe combined benchmark and add to leaderboard side panel

* Update benchmark_selector.py

* Update tasks & benchmarks tables

* fix: Rename display name of VDR (#2734)

* Update tasks & benchmarks tables

* 1.38.19

Automatically generated by python-semantic-release

* fix: Add colpali models family (#2721)

* add colpali models

* add colpali as framework

* add colpali as framework

* update metadata and add colsmol

* fix typos

* account for revision

* add training data info and lint

* modify meta

* correct colmodels meta and add colnomic 7b

* fix typo in toml (colpali subdeps)

* refine colmodel loading and metadata

* 1.38.20

Automatically generated by python-semantic-release

* fix: Correct embedding dimension for bge-m3 (#2738)

Fixes #2735

* 1.38.21

Automatically generated by python-semantic-release

* docs: Updated description of FEVER (#2745)

* docs: Updated description of FEVER

Update the description to state that the corpus is the same as fever, as we have had [multiple questions on it](https://huggingface.co/datasets/mteb/climate-fever/discussions/2)

* minor

* Backfill task metadata for BigPatentClustering and AllegroReviews (#2755)

* big-patent

* allegro-reviews

* Update tasks & benchmarks tables

* Update Seed1.5 training data (#2749)

* update seed1.5 training data

* update seed1.5 training data

* fix: Update caltech101 (#2759)

* docs: Updated description of FEVER

Update the description to state that the corpus is the same as fever, as we have had [multiple questions on it](https://huggingface.co/datasets/mteb/climate-fever/discussions/2)

* fix: Update Caltech101 to different source

Ran both versions of the task using `nomic-ai/nomic-embed-text-v1.5`; the two scores agree to within 1e-4:

### Old

```
{
  "dataset_revision": "851374102055782c84f89b1b4e9d128a6568847b",
  "task_name": "Caltech101",
  "mteb_version": "1.38.4",
  "scores": {
    "test": [
      {
        "accuracy": 0.897863,
```

### New
```
{
  "dataset_revision": "52439cf6d4f6ebf563d8cdc7f2c5371d9efd2686",
  "task_name": "Caltech101",
  "mteb_version": "1.38.4",
  "scores": {
    "test": [
      {
        "accuracy": 0.897929,
```
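
The comparison between the old and new runs can be checked numerically. The following is a minimal sketch, not part of the PR: the two accuracy values are copied from the snippets above, while the tolerance and the helper name `scores_match` are assumptions chosen for illustration.

```python
import math

# Accuracy values copied from the "Old" and "New" result snippets above.
old_accuracy = 0.897863
new_accuracy = 0.897929

# Hypothetical tolerance chosen for illustration; the PR does not state one.
TOLERANCE = 1e-3


def scores_match(old: float, new: float, tol: float = TOLERANCE) -> bool:
    """Return True when the two scores differ by at most `tol` (absolute)."""
    return math.isclose(old, new, rel_tol=0.0, abs_tol=tol)


difference = abs(new_accuracy - old_accuracy)
print(f"absolute difference: {difference:.6f}")
print("match within tolerance:", scores_match(old_accuracy, new_accuracy))
```

An absolute tolerance is used here because the scores are already on a fixed 0 to 1 scale; a relative tolerance would work equally well for values this close.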

* 1.38.22

Automatically generated by python-semantic-release

* Add missing PatchCamelyon_labels.txt (#2756)

* ci: Delete cache in Model loading test only when model is loaded (#2761)

* only delete cache when model loaded

* testing it out

* fix: Add `cadet-embed-base-v1` (#2727)

* update

* update overview.py for models

* update

* update

* 1.38.23

Automatically generated by python-semantic-release

* Fixing Google embedding task type for STS (#2767)

The type `SIMILARITY` is invalid. Correct one: `SEMANTIC_SIMILARITY`. See https://cloud.google.com/vertex-ai/generative-ai/docs/embeddings/task-types#supported_task_types
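
A guard like the one below would catch an invalid task type before a request is sent. This is only a sketch: the set of task types is recalled from the Vertex AI documentation linked above and may be incomplete or out of date, and `validate_task_type` is a hypothetical helper, not a function from mteb or the Google SDK.

```python
# Task types as recalled from the Vertex AI documentation linked above.
# Assumption: this list may be incomplete or may change over time.
SUPPORTED_TASK_TYPES = frozenset(
    {
        "RETRIEVAL_QUERY",
        "RETRIEVAL_DOCUMENT",
        "SEMANTIC_SIMILARITY",
        "CLASSIFICATION",
        "CLUSTERING",
        "QUESTION_ANSWERING",
        "FACT_VERIFICATION",
        "CODE_RETRIEVAL_QUERY",
    }
)


def validate_task_type(task_type: str) -> str:
    """Return `task_type` unchanged if it is supported, else raise ValueError."""
    if task_type not in SUPPORTED_TASK_TYPES:
        raise ValueError(
            f"Unsupported task type {task_type!r}; "
            f"expected one of {sorted(SUPPORTED_TASK_TYPES)}"
        )
    return task_type


# The corrected value passes; the old value "SIMILARITY" would raise.
print(validate_task_type("SEMANTIC_SIMILARITY"))
```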

* docs: Leaderboard simplifications (#2764)

* docs: Leaderboard simplifications

Simplified sidebar, notably:

1) Combined Language and Regional (since these are all languages)
2) Folded all sections (with Visual Document Retrieval added, the images start to take up a lot of space)
3) Removed legacy and instead added "Other" in language, where I moved "English Legacy"

I also restructured the code so that nesting is easier.

Is it also possible to create a separate section (see dummy screenshot)?

* refactor to reduce nesting

* format

* fix: add xet support (#2603)

* add xet version

* add doc comment

* change xet requirements

* Update docs/usage/usage.md

---------

Co-authored-by: Kenneth Enevoldsen <[email protected]>

* 1.38.24

Automatically generated by python-semantic-release

* fix: Update giga embeddings (#2774)

* update giga embeddings

* update giga embeddings

---------

Co-authored-by: Kolodin Egor <[email protected]>

* ci: add new prefixes to releases (#2766)

add new prefixes

* 1.38.25

Automatically generated by python-semantic-release

* fix: Update Caltech101 datasets to latest revision [v1] (#2778)

* fix: Update Caltech101 datasets to latest revision [v2]

 fixes: #2770
Fixes the issue, but only in v1

```
# tested using:
import mteb

task: mteb.AbsTask = mteb.get_task("Caltech101ZeroShot")
task.load_data()
task.get_candidate_labels()
```

* fix rev

* 1.38.26

Automatically generated by python-semantic-release

* fix: CachedEmbeddingWrapper issues in both documentation and code (#2779)

Fixes #2772

* 1.38.27

Automatically generated by python-semantic-release

* dataset: Add miracl vision (#2736)

* add miracl vision

* add miracl vision

* ruff

* cast

* image

* image

* add langs

* add langs

* add langs

* add langs

* descriptive stats

* lint

* lint

* lint

* remove com

* Update tasks & benchmarks tables

* model: Add Qwen3 Embedding model (#2769)

* Init code

* Remove extra config and lint code

* use sentence transformer

* add revisions

* fix lint

* Apply suggestions from code review

Co-authored-by: Roman Solomatin <[email protected]>

* fix lint

* add framework

---------

Co-authored-by: Roman Solomatin <[email protected]>

* bump ruff (#2784)

* Update issue and pr templates (#2782)

* Update issue templates

* Update bug_report.md

* test yaml template

* add templates

* update templates

* add emojis

* fix typo

* Apply suggestions from code review

Co-authored-by: Kenneth Enevoldsen <[email protected]>

* update issue titles

* update PR template

* remove PR templates

---------

Co-authored-by: Kenneth Enevoldsen <[email protected]>

* model: Add GeoGPT-Research-Project/GeoEmbedding (#2773)

* add model: geogpt_models

* update geogpt_models

* use InstructSentenceTransformerWrapper

* resolve pylint warning

* format geogpt_models.py

* Update mteb/models/geogpt_models.py

Co-authored-by: Roman Solomatin <[email protected]>

* Update mteb/models/geogpt_models.py

---------

Co-authored-by: zhangzeqing <[email protected]>
Co-authored-by: Roman Solomatin <[email protected]>
Co-authored-by: Kenneth Enevoldsen <[email protected]>

* model: add fangxq/XYZ-embedding (#2741)

* add xyz model

* add xyz model

* add xyz model

* update

* update

* update

* update

* update

* update

* update

* lint

---------

Co-authored-by: Roman Solomatin <[email protected]>
Co-authored-by: Kenneth Enevoldsen <[email protected]>

* ci: fix config error for semantic release (#2800)

discussed in: #2796

* dataset: Add R2MED Benchmark (#2795)

* Add files via upload

* Add files via upload

* Update benchmarks.py

* Update __init__.py

* Add files via upload

* Update R2MEDRetrieval.py

* Update run_mteb_r2med.py

* Delete scripts/run_mteb_r2med.py

* Update mteb/tasks/Retrieval/eng/R2MEDRetrieval.py

Co-authored-by: Roman Solomatin <[email protected]>

* Update mteb/tasks/Retrieval/eng/R2MEDRetrieval.py

Co-authored-by: Roman Solomatin <[email protected]>

* Update mteb/tasks/Retrieval/eng/R2MEDRetrieval.py

Co-authored-by: Roman Solomatin <[email protected]>

* Update mteb/tasks/Retrieval/eng/R2MEDRetrieval.py

Co-authored-by: Roman Solomatin <[email protected]>

* Add files via upload

* Delete mteb/descriptive_stats/Retrieval/R2MEDRetrieval.json

* Add files via upload

* Add files via upload

* Add files via upload

* Update R2MEDRetrieval.py

* Add files via upload

* Add files via upload

* Add files via upload

* Add files via upload

* format citations

* Update R2MEDRetrieval.py

* Add files via upload

* Add files via upload

---------

Co-authored-by: Li Lei <[email protected]>
Co-authored-by: Roman Solomatin <[email protected]>

* Update tasks & benchmarks tables

* Update training datasets of GeoGPT-Research-Project/GeoEmbedding (#2802)

update training datasets

Co-authored-by: zhangzeqing <[email protected]>

* fix: Add adapted_from to Cmedqaretrieval (#2806)

* fix: Add adapted_from to Cmedqaretrieval

Also snuck in a fix with form=None, which is no longer valid, but was still used in a few places.

* format

* 1.38.28

Automatically generated by python-semantic-release

* fix: Adding client arg to init method of OpenAI models wrapper (#2803)

* Adding OpenAI client arg to init method (e.g., for already initialized AzureOpenAI client)

To use OpenAI embedding models via Azure, the model wrapper needs to be initialized with a different client.
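
The pattern described here is plain dependency injection: the wrapper accepts an already-initialized client and only falls back to a default when none is given. The sketch below illustrates that idea with stand-ins; `EmbeddingWrapper`, `default_client`, and the callable-based client are all hypothetical and do not reproduce the actual signature in `mteb/models/openai_models.py`.

```python
from typing import Callable, List, Optional, Sequence

# A callable stands in for a real API client so the sketch runs offline.
EmbedFn = Callable[[Sequence[str]], List[List[float]]]


def default_client(texts: Sequence[str]) -> List[List[float]]:
    """Hypothetical default backend: one toy feature per text (its length)."""
    return [[float(len(text))] for text in texts]


class EmbeddingWrapper:
    """Hypothetical wrapper that uses an injected client when one is provided."""

    def __init__(self, model_name: str, client: Optional[EmbedFn] = None) -> None:
        self.model_name = model_name
        # Fall back to the default only when the caller supplied no client.
        self._client = client if client is not None else default_client

    def encode(self, texts: Sequence[str]) -> List[List[float]]:
        return self._client(list(texts))


# A caller with a differently configured backend passes it in explicitly.
custom = EmbeddingWrapper("some-model", client=lambda texts: [[1.0] for _ in texts])
print(custom.encode(["a", "b"]))
```

With a real SDK, the injected object would be the preconfigured client instance rather than a bare callable; the fallback logic stays the same.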

* Update mteb/models/openai_models.py

Co-authored-by: Roman Solomatin <[email protected]>

* Update mteb/models/openai_models.py

* remove comment and format

---------

Co-authored-by: Kenneth Enevoldsen <[email protected]>
Co-authored-by: Roman Solomatin <[email protected]>

* model: Add annamodels/LGAI-Embedding-Preview (#2810)

Add LGAI-Embedding

- Add mteb/models/lgai_embedding_models.py

- defined model metadata

* fix: Ensure bright uses the correct revision (#2812)

fixes #2811

* 1.38.29

Automatically generated by python-semantic-release

* add description to issue template (#2817)

* add description to template

* fix typo

* model: Added 3 HIT-TMG's KaLM-embedding models (#2478)

* Added HIT-TMG_KaLM-embedding-multilingual-mini-instruct-v1 with instruct wrapper

* Added KaLM_embedding_multilingual_mini_instruct_v1_5

* Added model to overview.py

* Fix Task Count Per Language Table in tasks.md

* resolve conflicts

* remove tasks.md

* Modified get_instruction function

* Added support for prompt dict in get_instruction

* fix lang code

* Address comments

* Delete mteb/models/check_models.py

* added prompts_dict support in InstructSentenceTransformerWrapper

* corrected instruction format

* corrected prompts format

* added correct instruction format

* fix implementation

* remove `if name main`

* add comment

---------

Co-authored-by: Roman Solomatin <[email protected]>

* fix: Reuploaded previously unavailable SNL datasets (#2819)

* fix: Reuploaded previously unavailable SNL datasets

closes #2477

* removed exceptions from tests

* temp fixes

* added temporary fix

* clean up commented out code

* format

* Update tasks & benchmarks tables

* 1.38.30

Automatically generated by python-semantic-release

* docs: Fix some typos in `docs/usage/usage.md` (#2835)

* Update usage.md

* Update usage.md

* Update docs/usage/usage.md

---------

Co-authored-by: Isaac Chung <[email protected]>

* model: Add custom instructions for GigaEmbeddings (#2836)

* add custom instructions

* fixed

* lint

* fix last instruction

---------

Co-authored-by: Kolodin Egor <[email protected]>
Co-authored-by: Roman Solomatin <[email protected]>

* model: add Seed-1.6-embedding model (#2841)

* add Seed-1.6-embedding model

* Update seed_1_6_embedding_models.py

* update model meta info

* support image encoder interface

* error fix

* fix: format seed_1_6_embedding_models.py with Ruff

* fix: Update model selection for the leaderboard (#2855)

* fix: Update model selection for the leaderboard

fixes #2834

This removed the lower bound selection, but generally I don't think people should care about the models being too small.

* fix 1M --> 1B

* format

* rename model_size -> max_model_size

* 1.38.31

Automatically generated by python-semantic-release

* fix: update training dataset info of Seed-1.6-embedding model  (#2857)

update seed1.6 model training data info

* 1.38.32

Automatically generated by python-semantic-release

* add jinav4 model meta (#2858)

* add model meta

* linting

* fix: add check for code lora

* fix: apply review comments

* fix: prompt validation for tasks with `-` (#2846)

* fix prompt validation

* fix task name split correctly

* add docstring for test
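
One way to split a prompt key safely when the task name itself contains `-` is to split only on the last hyphen and only when the suffix is a known prompt type. This is a sketch under assumptions, not the code from the PR: the `TaskName-prompt_type` key format, the prompt types `query` and `passage`, and the helper `split_prompt_name` are all assumed for illustration.

```python
from typing import Optional, Tuple

# Assumed prompt-type suffixes; the real set lives in the library, not here.
PROMPT_TYPES = ("query", "passage")


def split_prompt_name(name: str) -> Tuple[str, Optional[str]]:
    """Split `name` into (task_name, prompt_type).

    Only the last hyphen is considered, and only when what follows it is a
    known prompt type, so hyphens inside the task name are left intact.
    """
    task_name, sep, suffix = name.rpartition("-")
    if sep and suffix in PROMPT_TYPES:
        return task_name, suffix
    return name, None


for key in ("NFCorpus-query", "My-Task-passage", "My-Task"):
    print(key, "->", split_prompt_name(key))
```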

* 1.38.33

Automatically generated by python-semantic-release

* model: Adding Sailesh97/Hinvec (#2842)

* Adding Hinvec Model's Meta data.

* Adding hinvec_model.py

* Update mteb/models/hinvec_models.py

Co-authored-by: Kenneth Enevoldsen <[email protected]>

* formatted code with Black and linted with Ruff

---------

Co-authored-by: Kenneth Enevoldsen <[email protected]>

* Bump gradio to fix leaderboard sorting (#2866)

Bump gradio

* model: Adding nvidia/llama-nemoretriever-colembed models (#2861)

* nvidia_llama_nemoretriever_colembed

* correct 3b reference

* lint fix

* add training data and license for nvidia/llama_nemoretriever_colembed

* lint

---------

Co-authored-by: Isaac Chung <[email protected]>

* rename seed-1.6-embedding to seed1.6-embedding (#2870)

* fix tests to be compatible with `SentenceTransformers` `v5` (#2875)

* fix sbert `v5`

* add comment

* model: add listconranker modelmeta (#2874)

* add listconranker modelmeta

* fix bugs

* use linter

* lint

---------

Co-authored-by: Roman Solomatin <[email protected]>

* model: add kalm_models ModelMeta (new PR) (#2853)

* feat: add KaLM_Embedding_X_0605 in kalm_models

* Update kalm_models.py for lint format

---------

Co-authored-by: xinshuohu <[email protected]>

* Comment kalm model (#2877)

comment kalm model

* Add and fix some Japanese datasets: ANLP datasets, JaCWIR, JQaRA (#2872)

* Add JaCWIR and JQaRA for reranking

* Fix ANLP Journal datasets

* Add NLPJournalAbsArticleRetrieval and JaCWIRRetrieval

* tackle test cases

* Remove _evaluate_subset usage

* Separate v1 and v2

* Update info for NLP Journal datasets

* Update tasks & benchmarks tables

* model: add Hakim and TookaSBERTV2 models (#2826)

* add tooka v2s

* add mcinext models

* update mcinext.py

* Apply PR review suggestions

* Update mteb/models/mcinext_models.py

---------

Co-authored-by: mehran <[email protected]>
Co-authored-by: Kenneth Enevoldsen <[email protected]>

* dataset: Evalita dataset integration (#2859)

* Added DadoEvalCoarseClassification

* Removed unnecessary columns from DadoEvalCoarseClassification

* Added EmitClassification task

* added SardiStanceClassification task

* Added GeoLingItClassification task

* Added DisCoTexPairClassification tasks

* Added EmitClassification, DadoEvalCoarseClassification, GeoLingItClassification, SardiStanceClassification inside the inits

* changed import in DisCoTexPairClassification

* removed GeoLingItClassification dataset

* fixed citation formatting, missing metadata parameters and lint formatting

* - Added XGlueWRPReranking task
- Added missing __init__.py files

* fixed metadata in XGlueWRPReranking

* Added MKQARetrieval task

* fixed type in XGlueWRPReranking

* changed MKQARetrieval from  cross-lingual to monolingual

* formatted MKQARetrieval file

* removed unused const

---------

Co-authored-by: Mattia Sangermano <[email protected]>

* Update tasks & benchmarks tables

* fix: pin datasets version (#2892)

fix datasets version

* 1.38.34

Automatically generated by python-semantic-release

* merge main

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Kenneth Enevoldsen <[email protected]>
Co-authored-by: Isaac Chung <[email protected]>
Co-authored-by: github-actions <[email protected]>
Co-authored-by: Alexey Vatolin <[email protected]>
Co-authored-by: Imene Kerboua <[email protected]>
Co-authored-by: Ömer Veysel Çağatan <[email protected]>
Co-authored-by: Munot Ayush Sunil <[email protected]>
Co-authored-by: 24September <[email protected]>
Co-authored-by: namespace-Pt <[email protected]>
Co-authored-by: zhangpeitian <[email protected]>
Co-authored-by: wang.yuqi <[email protected]>
Co-authored-by: Feiyang <[email protected]>
Co-authored-by: Thomas van Dongen <[email protected]>
Co-authored-by: Paul Teiletche <[email protected]>
Co-authored-by: Mehran Sarmadi <[email protected]>
Co-authored-by: mehran <[email protected]>
Co-authored-by: Dawid Koterwas <[email protected]>
Co-authored-by: Wentao Wu <[email protected]>
Co-authored-by: Manveer Tamber <[email protected]>
Co-authored-by: malteos <[email protected]>
Co-authored-by: Egor <[email protected]>
Co-authored-by: Kolodin Egor <[email protected]>
Co-authored-by: Manuel Faysse <[email protected]>
Co-authored-by: Xin Zhang <[email protected]>
Co-authored-by: Hypothesis-Z <[email protected]>
Co-authored-by: zhangzeqing <[email protected]>
Co-authored-by: fangxiaoquan <[email protected]>
Co-authored-by: Li Lei <[email protected]>
Co-authored-by: annamodels <[email protected]>
Co-authored-by: Sadra Barikbin <[email protected]>
Co-authored-by: Quan Yuhan <[email protected]>
Co-authored-by: Quan Yuhan <[email protected]>
Co-authored-by: Mohammad Kalim Akram <[email protected]>
Co-authored-by: Sailesh Panda <[email protected]>
Co-authored-by: bschifferer <[email protected]>
Co-authored-by: tutuDoki <[email protected]>
Co-authored-by: Xinshuo Hu <[email protected]>
Co-authored-by: xinshuohu <[email protected]>
Co-authored-by: lsz05 <[email protected]>
Co-authored-by: Kenneth Enevoldsen <[email protected]>
Co-authored-by: MattiaSangermano <[email protected]>
Co-authored-by: Mattia Sangermano <[email protected]>
Samoed added a commit that referenced this pull request Jul 10, 2025
* bump ruff (#2784)

* Update issue and pr templates (#2782)

* Update issue templates

* Update bug_report.md

* test yaml template

* add templates

* update templates

* add emojis

* fix typo

* Apply suggestions from code review

Co-authored-by: Kenneth Enevoldsen <[email protected]>

* update issue titles

* update PR template

* remove PR templates

---------

Co-authored-by: Kenneth Enevoldsen <[email protected]>

* model: Add GeoGPT-Research-Project/GeoEmbedding (#2773)

* add model: geogpt_models

* update geogpt_models

* use InstructSentenceTransformerWrapper

* resolve pylint warning

* format geogpt_models.py

* Update mteb/models/geogpt_models.py

Co-authored-by: Roman Solomatin <[email protected]>

* Update mteb/models/geogpt_models.py

---------

Co-authored-by: zhangzeqing <[email protected]>
Co-authored-by: Roman Solomatin <[email protected]>
Co-authored-by: Kenneth Enevoldsen <[email protected]>

* model: add fangxq/XYZ-embedding (#2741)

* add xyz model

* add xyz model

* add xyz model

* update

* update

* update

* update

* update

* update

* update

* lint

---------

Co-authored-by: Roman Solomatin <[email protected]>
Co-authored-by: Kenneth Enevoldsen <[email protected]>

* ci: fix config error for semantic release (#2800)

discussed in: #2796

* dataset: Add R2MED Benchmark (#2795)

* Add files via upload

* Add files via upload

* Update benchmarks.py

* Update __init__.py

* Add files via upload

* Update R2MEDRetrieval.py

* Update run_mteb_r2med.py

* Delete scripts/run_mteb_r2med.py

* Update mteb/tasks/Retrieval/eng/R2MEDRetrieval.py

Co-authored-by: Roman Solomatin <[email protected]>

* Update mteb/tasks/Retrieval/eng/R2MEDRetrieval.py

Co-authored-by: Roman Solomatin <[email protected]>

* Update mteb/tasks/Retrieval/eng/R2MEDRetrieval.py

Co-authored-by: Roman Solomatin <[email protected]>

* Update mteb/tasks/Retrieval/eng/R2MEDRetrieval.py

Co-authored-by: Roman Solomatin <[email protected]>

* Add files via upload

* Delete mteb/descriptive_stats/Retrieval/R2MEDRetrieval.json

* Add files via upload

* Add files via upload

* Add files via upload

* Update R2MEDRetrieval.py

* Add files via upload

* Add files via upload

* Add files via upload

* Add files via upload

* format citations

* Update R2MEDRetrieval.py

* Add files via upload

* Add files via upload

---------

Co-authored-by: Li Lei <[email protected]>
Co-authored-by: Roman Solomatin <[email protected]>

* Update tasks & benchmarks tables

* Update training datasets of GeoGPT-Research-Project/GeoEmbedding (#2802)

update training datasets

Co-authored-by: zhangzeqing <[email protected]>

* fix: Add adapted_from to Cmedqaretrieval (#2806)

* fix: Add adapted_from to Cmedqaretrieval

Also snuck in a fix with form=None, which is no longer valid, but was still used in a few places.

* format

* 1.38.28

Automatically generated by python-semantic-release

* fix: Adding client arg to init method of OpenAI models wrapper (#2803)

* Adding OpenAI client arg to init method (e.g., for already initialized AzureOpenAI client)

To use OpenAI embedding models via Azure, the model wrapper needs to be initialized with a different client.

* Update mteb/models/openai_models.py

Co-authored-by: Roman Solomatin <[email protected]>

* Update mteb/models/openai_models.py

* remove comment and format

---------

Co-authored-by: Kenneth Enevoldsen <[email protected]>
Co-authored-by: Roman Solomatin <[email protected]>

* model: Add annamodels/LGAI-Embedding-Preview (#2810)

Add LGAI-Embedding

- Add mteb/models/lgai_embedding_models.py

- defined model metadata

* fix: Ensure bright uses the correct revision (#2812)

fixes #2811

* 1.38.29

Automatically generated by python-semantic-release

* add description to issue template (#2817)

* add description to template

* fix typo

* model: Added 3 HIT-TMG's KaLM-embedding models (#2478)

* Added HIT-TMG_KaLM-embedding-multilingual-mini-instruct-v1 with instruct wrapper

* Added KaLM_embedding_multilingual_mini_instruct_v1_5

* Added model to overview.py

* Fix Task Count Per Language Table in tasks.md

* resolve conflicts

* remove tasks.md

* Modified get_instruction function

* Added support for prompt dict in get_instruction

* fix lang code

* Address comments

* Delete mteb/models/check_models.py

* added prompts_dict support in InstructSentenceTransformerWrapper

* corrected instruction format

* corrected prompts format

* added correct instruction format

* fix implementation

* remove `if name main`

* add comment

---------

Co-authored-by: Roman Solomatin <[email protected]>

* fix: Reuploaded previously unavailable SNL datasets (#2819)

* fix: Reuploaded previously unavailable SNL datasets

closes #2477

* removed exceptions from tests

* temp fixes

* added temporary fix

* clean up commented out code

* format

* Update tasks & benchmarks tables

* 1.38.30

Automatically generated by python-semantic-release

* docs: Fix some typos in `docs/usage/usage.md` (#2835)

* Update usage.md

* Update usage.md

* Update docs/usage/usage.md

---------

Co-authored-by: Isaac Chung <[email protected]>

* model: Add custom instructions for GigaEmbeddings (#2836)

* add custom instructions

* fixed

* lint

* fix last instruction

---------

Co-authored-by: Kolodin Egor <[email protected]>
Co-authored-by: Roman Solomatin <[email protected]>

* model: add Seed-1.6-embedding model (#2841)

* add Seed-1.6-embedding model

* Update seed_1_6_embedding_models.py

* update model meta info

* support image encoder interface

* error fix

* fix: format seed_1_6_embedding_models.py with Ruff

* fix: Update model selection for the leaderboard (#2855)

* fix: Update model selection for the leaderboard

fixes #2834

This removed the lower bound selection, but generally I don't think people should care about the models being too small.

* fix 1M --> 1B

* format

* rename model_size -> max_model_size

* 1.38.31

Automatically generated by python-semantic-release

* fix: update training dataset info of Seed-1.6-embedding model  (#2857)

update seed1.6 model training data info

* 1.38.32

Automatically generated by python-semantic-release

* add jinav4 model meta (#2858)

* add model meta

* linting

* fix: add check for code lora

* fix: apply review comments

* fix: prompt validation for tasks with `-` (#2846)

* fix prompt validation

* fix task name split correctly

* add docstring for test

* 1.38.33

Automatically generated by python-semantic-release

* model: Adding Sailesh97/Hinvec (#2842)

* Adding Hinvec Model's Meta data.

* Adding hinvec_model.py

* Update mteb/models/hinvec_models.py

Co-authored-by: Kenneth Enevoldsen <[email protected]>

* formatted code with Black and linted with Ruff

---------

Co-authored-by: Kenneth Enevoldsen <[email protected]>

* Bump gradio to fix leaderboard sorting (#2866)

Bump gradio

* model: Adding nvidia/llama-nemoretriever-colembed models (#2861)

* nvidia_llama_nemoretriever_colembed

* correct 3b reference

* lint fix

* add training data and license for nvidia/llama_nemoretriever_colembed

* lint

---------

Co-authored-by: Isaac Chung <[email protected]>

* rename seed-1.6-embedding to seed1.6-embedding (#2870)

* fix tests to be compatible with `SentenceTransformers` `v5` (#2875)

* fix sbert `v5`

* add comment

* model: add listconranker modelmeta (#2874)

* add listconranker modelmeta

* fix bugs

* use linter

* lint

---------

Co-authored-by: Roman Solomatin <[email protected]>

* model: add kalm_models ModelMeta (new PR) (#2853)

* feat: add KaLM_Embedding_X_0605 in kalm_models

* Update kalm_models.py for lint format

---------

Co-authored-by: xinshuohu <[email protected]>

* Comment kalm model (#2877)

comment kalm model

* Add and fix some Japanese datasets: ANLP datasets, JaCWIR, JQaRA (#2872)

* Add JaCWIR and JQaRA for reranking

* Fix ANLP Journal datasets

* Add NLPJournalAbsArticleRetrieval and JaCWIRRetrieval

* tackle test cases

* Remove _evaluate_subset usage

* Separate v1 and v2

* Update info for NLP Journal datasets

* Update tasks & benchmarks tables

* model: add Hakim and TookaSBERTV2 models (#2826)

* add tooka v2s

* add mcinext models

* update mcinext.py

* Apply PR review suggestions

* Update mteb/models/mcinext_models.py

---------

Co-authored-by: mehran <[email protected]>
Co-authored-by: Kenneth Enevoldsen <[email protected]>

* dataset: Evalita dataset integration (#2859)

* Added DadoEvalCoarseClassification

* Removed unnecessary columns from DadoEvalCoarseClassification

* Added EmitClassification task

* added SardiStanceClassification task

* Added GeoLingItClassification task

* Added DisCoTexPairClassification tasks

* Added EmitClassification, DadoEvalCoarseClassification, GeoLingItClassification, SardiStanceClassification inside the inits

* changed import in DisCoTexPairClassification

* removed GeoLingItClassification dataset

* fixed citation formatting, missing metadata parameters and lint formatting

* - Added XGlueWRPReranking task
- Added missing __init__.py files

* fixed metadata in XGlueWRPReranking

* Added MKQARetrieval task

* fixed type in XGlueWRPReranking

* changed MKQARetrieval from cross-lingual to monolingual

* formatted MKQARetrieval file

* removed unused const

---------

Co-authored-by: Mattia Sangermano <[email protected]>

* Update tasks & benchmarks tables

* fix: pin datasets version (#2892)

fix datasets version

* 1.38.34

Automatically generated by python-semantic-release

* fix model implementations

* fix tasks

* add metrics

---------

Co-authored-by: Kenneth Enevoldsen <[email protected]>
Co-authored-by: Hypothesis-Z <[email protected]>
Co-authored-by: zhangzeqing <[email protected]>
Co-authored-by: fangxiaoquan <[email protected]>
Co-authored-by: Li Lei <[email protected]>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions <[email protected]>
Co-authored-by: malteos <[email protected]>
Co-authored-by: annamodels <[email protected]>
Co-authored-by: Munot Ayush Sunil <[email protected]>
Co-authored-by: Sadra Barikbin <[email protected]>
Co-authored-by: Isaac Chung <[email protected]>
Co-authored-by: Egor <[email protected]>
Co-authored-by: Kolodin Egor <[email protected]>
Co-authored-by: Quan Yuhan <[email protected]>
Co-authored-by: Quan Yuhan <[email protected]>
Co-authored-by: Mohammad Kalim Akram <[email protected]>
Co-authored-by: Sailesh Panda <[email protected]>
Co-authored-by: bschifferer <[email protected]>
Co-authored-by: tutuDoki <[email protected]>
Co-authored-by: Xinshuo Hu <[email protected]>
Co-authored-by: xinshuohu <[email protected]>
Co-authored-by: lsz05 <[email protected]>
Co-authored-by: Mehran Sarmadi <[email protected]>
Co-authored-by: mehran <[email protected]>
Co-authored-by: Kenneth Enevoldsen <[email protected]>
Co-authored-by: MattiaSangermano <[email protected]>
Co-authored-by: Mattia Sangermano <[email protected]>
Labels
new model Questions related to adding a new model to the benchmark

3 participants