Conversation

@Samoed Samoed commented Apr 19, 2025

Encodechka is an older Russian benchmark that served as the main evaluation suite before the introduction of MTEB(rus)

CC @avidale

MTEB results:

| Model | RUParaPhraserSTS | RuSTSBenchmarkSTS | XNLI | InappropriatenessClassificationv2 | RuNLUIntentClassification | RuToxicOKMLCUPClassification | SentiRuEval2016 |
|---|---|---|---|---|---|---|---|
| deepvk/USER-bge-m3 | 76.36 | 88.11 | 0.583967 | 0.66502 | 0.8271 | 0.700367 | — |
| sergeyzh/rubert-tiny-turbo | 72.15 | 78.47 | 0.5511 | 0.56762 | 0.7484 | 0.597567 | — |
| cointegrated/LaBSE-en-ru | 65.87 | 66.94 | 0.572433 | 0.6027 | 0.74645 | 0.5599 | — |

Encodechka results:

| Model | PI (RUParaPhraserSTS) | STS (RuSTSBenchmarkSTS) | NLI (XNLI) | IA (InappropriatenessClassificationv2) | IC (RuNLUIntentClassification, rus) | ICX (RuNLUIntentClassification, eng-rus) | TI (RuToxicOKMLCUPClassification) | SA (SentiRuEval2016) |
|---|---|---|---|---|---|---|---|---|
| deepvk/USER-bge-m3 | 0.76 | 0.87 | 0.58 | 0.79 | 0.81 | 0.78 | 0.97 | 0.82 |
| sergeyzh/rubert-tiny-turbo | 0.72 | 0.83 | 0.48 | 0.76 | 0.78 | 0.68 | 0.95 | 0.79 |
| cointegrated/LaBSE-en-ru | 0.66 | 0.79 | 0.43 | 0.77 | 0.79 | 0.77 | 0.95 | 0.76 |

Full results embeddings-benchmark/results#182

We see differences across most tasks due to differing classification settings. Encodechka uses cross-validation on the full training split with LogisticRegression(max_iter=10_000), whereas MTEB uses a fixed number of samples per label and LogisticRegression(max_iter=100). That said, I believe the overall ranking should remain similar.

As for the InappropriatenessClassification dataset, it appears to differ from the one Encodechka used; I wasn't able to match it to any version in the original repository.

Code Quality

  • Code Formatted: Format the code using make lint to maintain consistent style.

Documentation

  • Updated Documentation: Add or update documentation to reflect the changes introduced in this PR.

Testing

  • New Tests Added: Write tests to cover new functionality. Validate with make test-with-coverage.
  • Tests Passed: Run tests locally using make test or make test-with-coverage to ensure no existing functionality is broken.

Adding datasets checklist

Reason for dataset addition: ...

  • I have run the following models on the task (adding the results to the PR). These can be run using the `mteb -m {model_name} -t {task_name}` command.
    • sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
    • intfloat/multilingual-e5-small
  • I have checked that the performance is neither trivial (both models gain close to perfect scores) nor random (both models gain close to random scores).
  • If the dataset is too big (e.g. >2048 examples), consider using `self.stratified_subsampling()` under `dataset_transform()`
  • I have filled out the metadata object in the dataset file (find documentation on it here).
  • Run tests locally to make sure nothing is broken using make test.
  • Run the formatter to format the code using make lint.

Adding a model checklist

  • I have filled out the ModelMeta object to the extent possible
  • I have ensured that my model can be loaded using
    • mteb.get_model(model_name, revision) and
    • mteb.get_model_meta(model_name, revision)
  • I have tested the implementation works on a representative set of tasks.

@Samoed Samoed added the new benchmark Issues related to adding a new benchmark label Apr 19, 2025

@KennethEnevoldsen KennethEnevoldsen left a comment


Looks good, not much to add here

@KennethEnevoldsen KennethEnevoldsen changed the title add Encodechka benchmark fix: Add Encodechka benchmark Apr 26, 2025
# Conflicts:
#	mteb/benchmarks/benchmarks.py
#	mteb/leaderboard/benchmark_selector.py
@Samoed Samoed enabled auto-merge (squash) April 27, 2025 17:28
@Samoed Samoed disabled auto-merge April 27, 2025 17:44
@Samoed Samoed enabled auto-merge (squash) April 27, 2025 17:47
@Samoed Samoed merged commit 0737e78 into main Apr 27, 2025
8 checks passed
@Samoed Samoed deleted the encodechka branch April 27, 2025 17:54
@@ -1599,14 +1602,14 @@
document understanding, visual STS, and CV-centric tasks.""",
reference="",
contacts=["gowitheflow-1998", "isaac-chung"],
citation="""@article{xiao2025mieb,
Collaborator

@Samoed @KennethEnevoldsen this seems unrelated to this PR. I had previously updated this to match the MTEB paper's bibtex style. Would appreciate it if you could revert this.

Member Author

I will fix it in a separate PR then. I don't know why it was changed.

Samoed added a commit that referenced this pull request May 3, 2025
* SpeedTask add deprecated warning (#2493)

* Docs: Update README.md (#2494)

Update README.md

* fix transformers version for now (#2504)

* Fix typos (#2509)

* ci: refactor TaskMetadata eval langs test (#2501)

* refactor eval langs test

* function returns None

* add hard negatives tasks in _HISTORIC_DATASETS

* rename to ImageClustering folder (#2516)

rename folder

* Clean up trailing spaces citation (#2518)

* rename folder

* trailing spaces

* missed one

* [mieb] Memotion preprocessing code made more robust and readable (#2519)

* fix: validate lang code in ModelMeta (#2499)

* Update pyproject.toml (#2522)

* 1.36.38

Automatically generated by python-semantic-release

* Fix leaderboard version (#2524)

* fix gradio leaderboard run

* update docs

* Fix gte-multilingual-base embed_dim (#2526)

* [MIEB] Specify only the multilingual AggTask for MIEB-lite (#2539)

specify only the multilingual AggTask

* [mieb] fix hatefulmemes (#2531)

* fix hatefulmeme

* add to description and use polars instead

---------

Co-authored-by: Isaac Chung <[email protected]>

* Model conan (#2534)

* conan_models

* conan_models

* refactor code

* refactor code

---------

Co-authored-by: shyuli <[email protected]>

* fix: Update mteb.get_tasks with an exclude_aggregate parameter to exclude aggregate tasks (#2536)

* Implement task.is_aggregate check

* Add `mteb.get_tasks` parameter `include_aggregate` to exclude aggregate tasks if needed

* Update mteb.run with the new `task.is_aggregate` parameter

* Add tests

* Ran linter

* Changed logic to `exclude_aggregate`

* Updated from review comments

* Exclude aggregate by default false in get_tasks

* 1.36.39

Automatically generated by python-semantic-release

* docs: Add MIEB citation in benchmarks (#2544)

Add MIEB citation in benchmarks

* Add 2 new Vietnamese Retrieval Datasets (#2393)

* [ADD] 2 new Datasets

* [UPDATE] Change bibtext_citation for GreenNodeTableMarkdownRetrieval as TODO

* [UPDATE] Change bibtext_citation for ZacLegalTextRetrieval as TODO

* Update tasks table

* fix: CacheWrapper per task (#2467)

* feat: CacheWrapper per task

* refactor logic

* update documentation

---------

Co-authored-by: Florian Rottach <[email protected]>

* 1.36.40

Automatically generated by python-semantic-release

* misc: move MMTEB scripts and notebooks to separate repo (#2546)

move mmteb scripts and notebooks to separate repo

* fix: Update requirements in JinaWrapper (#2548)

fix: Update package requirements in JinaWrapper for einops and flash_attn

* 1.36.41

Automatically generated by python-semantic-release

* Docs: Add MIEB to README (#2550)

Add MIEB to README

* Add xlm_roberta_ua_distilled (#2547)

* defined model metadata for xlm_roberta_ua_distilled

* Update mteb/models/ua_sentence_models.py

Co-authored-by: Roman Solomatin <[email protected]>

* included ua_sentence_models.py in overview.py

* applied linting, added missing fields in ModelMeta

* applied linting

---------

Co-authored-by: Roman Solomatin <[email protected]>

* fix me5 training data config to include xquad dataset (#2552)

* fix: me5 training data config to include xquad dataset

* Update mteb/models/e5_models.py

update: xquad key name

Co-authored-by: Roman Solomatin <[email protected]>

* fix: ME5_TRAINING_DATA format

---------

Co-authored-by: Roman Solomatin <[email protected]>

* feat: Added dataframe utilities to BenchmarkResults (#2542)

* fix: Added dataframe utilities to BenchmarkResults

- Added `get_results_table`. I was considering renaming it to `to_dataframe` to align with `tasks.to_dataframe`. WDYT?
- Added tests for ModelResults and BenchmarksResults
- Added a few utility functions where needed
- Added docstring throughout ModelResults and BenchmarksResults
- Added todo comments for missing aspects - mostly v2 - but join_revisions seems like it could use an update before then.

Prerequisite for #2454:

@ayush1298 can I ask you to review this PR as well? I hope this give an idea of what I was hinting at. Sorry that it took a while. I wanted to make sure to get it right.

* refactor to to_dataframe and combine common dependencies

* ibid

* fix revision joining after discussion with @x-tabdeveloping

* remove strict=True for zip() as it is a >3.9 feature

* updated mock cache

* 1.37.0

Automatically generated by python-semantic-release

* fix e5_R_mistral_7b (#2490)

* fix e5_R_mistral_7b

* change wrapper

* address comments

* Added kwargs for pad_token

* correct lang format

* address comments

* add revision

---------

Co-authored-by: Roman Solomatin <[email protected]>

* fix unintentional working of filters on leaderboard (#2535)

* fix unintentional working of filters on leaderboard

* address comments

* make lint

* address comments

* rollback unnecessary changes

* feat: UI Overhaul (#2549)

* Bumped gradio version to latest

* Added new Gradio table functionality to leaderboard

* Removed search bar

* Changed color scheme in plot to match the table

* Added new benchmark selector in sidebar

* Changed not activated button type to secondary

* Short-circuited callbacks that are based on language selection

* Re-added column width calculation since it got messed up

* Commented out gradient for per-task table as it slowed things down substantially

* Styling and layout updates

* Adjusted comments according to reviews

* Converted all print statements to logger.debug

* Removed pydantic version fix

* Ran linting

* Remove commented out code

Co-authored-by: Kenneth Enevoldsen <[email protected]>

* Moved English,v1 to Legacy section

* Closed the benchmark sharing accordion by default

* Adjusted markdown blocks according to suggestions

* Ran linter

---------

Co-authored-by: Kenneth Enevoldsen <[email protected]>

* 1.38.0

Automatically generated by python-semantic-release

* add USER2 (#2560)

* add user2

* add training code

* update prompts

* Fix leaderboard entry for BuiltBench (#2563)

Fix leaderboard entry for BuiltBench (#2562)

Co-authored-by: Mehrzad Shahin-Moghadam <[email protected]>

* fix: jasper models embeddings having nan values (#2481)

* 1.38.1

Automatically generated by python-semantic-release

* fix frida datasets (#2565)

* Add relle (#2564)

* Add relle
* defined model metadata for relle

* Add mteb/models/relle_models.py

* Update mteb/models/relle_models.py

Co-authored-by: Roman Solomatin <[email protected]>

* lint after commit

run after "make lint"

* Add into model_modules

Add model into model_modules and lint check

---------

Co-authored-by: Roman Solomatin <[email protected]>

* Backfill task metadata for metadata for GermanDPR and GermanQuAD (#2566)

* Add metadata for GermanDPR and GermanQuAD

* PR improvements

* Update tasks table

* Add  ModelMeta for CodeSearch-ModernBERT-Crow-Plus (#2570)

* Add files via upload

* Update shuu_model.py

* Update shuu_model.py

* Update shuu_model.py

* Update shuu_model.py

* Update shuu_model.py

* Update shuu_model.py

* Update shuu_model.py

* Update shuu_model.py

* Update shuu_model.py

* Update shuu_model.py

* Update shuu_model.py

* Update shuu_model.py

* Update overview.py

* Update shuu_model.py

* Update shuu_model.py

* Update shuu_model.py

* Update mteb/models/shuu_model.py

Co-authored-by: Roman Solomatin <[email protected]>

---------

Co-authored-by: Roman Solomatin <[email protected]>

* Docs: Improve MIEB docs (#2569)

* Add missing annotations (#2498)

* Update tasks table

* move icon & name to benchmark dataclass (#2573)

* Remove the comments from ImageEncoder (#2579)

* fix: Add Encodechka benchmark (#2561)

* add tasks

* add benchmark

* fix imports

* update stsb split

* Update tasks table

* 1.38.2

Automatically generated by python-semantic-release

* fix FlagEmbedding package name (#2588)

* fix codecarbon version (#2587)

* Add MIEB image only benchmark (#2590)

* add vision only bench

* add description

* correct zs task modalities

* specify tasks param

* Add image only MIEB benchmark to LB left panel (#2596)

* Update benchmarks.py

* make lint

* add to left side bar

* update Doubao-1.5-Embedding (#2575)

* update seed-embedding

* update seed models

* fix linting and tiktoken problem

* fix tiktoken bug

* fix lint

* update name

* Update mteb/models/seed_models.py

adopt suggestion

Co-authored-by: Roman Solomatin <[email protected]>

* update logging

* update lint

---------

Co-authored-by: zhangpeitian <[email protected]>
Co-authored-by: Roman Solomatin <[email protected]>

* fix: Add WebSSL models (#2604)

* add 2 web SSL dino models

* add models from collection and revisions

* update memory_usage_mb and embed dim

* use automodel instead

* fix mieb citation (#2606)

* 1.38.3

Automatically generated by python-semantic-release

* Update Doubao-1.5-Embedding (#2611)

* update seed-embedding

* update seed models

* fix linting and tiktoken problem

* fix tiktoken bug

* fix lint

* update name

* Update mteb/models/seed_models.py

adopt suggestion

Co-authored-by: Roman Solomatin <[email protected]>

* update logging

* update lint

* update link

---------

Co-authored-by: zhangpeitian <[email protected]>
Co-authored-by: Roman Solomatin <[email protected]>

* CI: update benchmark table (#2609)

* update benchmark table

* fix table

* Update tasks & benchmarks tables

* Update tasks & benchmarks tables

* Update Doubao-1.5-Embedding revision (#2613)

* update seed-embedding

* update seed models

* fix linting and tiktoken problem

* fix tiktoken bug

* fix lint

* update name

* Update mteb/models/seed_models.py

adopt suggestion

Co-authored-by: Roman Solomatin <[email protected]>

* update logging

* update lint

* update link

* update revision

---------

Co-authored-by: zhangpeitian <[email protected]>
Co-authored-by: Roman Solomatin <[email protected]>

* Update tasks & benchmarks tables

* CI: fix table  (#2615)

* Update tasks & benchmarks tables

* fixes

* Update gradio version (#2558)

* Update gradio version

Closes #2557

* bump gradio

* fix: Removed missing dataset for MTEB(Multilingual) and bumped version

We should probably just have done this earlier to ensure that the multilingual benchmark is runnable.

* CI: fix infinitely committing issue (#2616)

* fix token

* try to trigger

* add token

* test ci

* Update tasks & benchmarks tables

* Update tasks & benchmarks tables

* remove test lines

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

* fix retrieval loader

* add descriptive stats

* Add ScandiSent dataset (#2620)

* add scandisent dataset

* add to init

* typo

* lint

* 1.38.4

Automatically generated by python-semantic-release

* Format all citations (#2614)

* Fix errors in bibtex_citation

* Format all bibtex_citation fields

* format benchmarks

* fix format

* Fix tests

* add formatting script

* fix citations

* update imports

* fix citations

* fix citations

* format citation

---------

Co-authored-by: Isaac Chung <[email protected]>
Co-authored-by: Niklas Muennighoff <[email protected]>
Co-authored-by: chenghao xiao <[email protected]>
Co-authored-by: Munot Ayush Sunil <[email protected]>
Co-authored-by: github-actions <[email protected]>
Co-authored-by: E. Tolga Ayan <[email protected]>
Co-authored-by: lllsy12138 <[email protected]>
Co-authored-by: shyuli <[email protected]>
Co-authored-by: Siddharth M. Bhatia <[email protected]>
Co-authored-by: Bao Loc Pham <[email protected]>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Flo <[email protected]>
Co-authored-by: Florian Rottach <[email protected]>
Co-authored-by: Alexey Vatolin <[email protected]>
Co-authored-by: Olesksii Horchynskyi <[email protected]>
Co-authored-by: Pandaswag <[email protected]>
Co-authored-by: Kenneth Enevoldsen <[email protected]>
Co-authored-by: Márton Kardos <[email protected]>
Co-authored-by: Mehrzad Shahin-Moghadam <[email protected]>
Co-authored-by: Mehrzad Shahin-Moghadam <[email protected]>
Co-authored-by: Youngjoon Jang <[email protected]>
Co-authored-by: 24September <[email protected]>
Co-authored-by: Jan Karaś <[email protected]>
Co-authored-by: Shuu <[email protected]>
Co-authored-by: namespace-Pt <[email protected]>
Co-authored-by: zhangpeitian <[email protected]>
isaac-chung added a commit that referenced this pull request May 3, 2025
* Update tasks table

* 1.36.26

Automatically generated by python-semantic-release

* Pass task name to all evaluators (#2389)

* pass task name to all tasks

* add test

* fix loader

* fix: renaming Zeroshot -> ZeroShot (#2395)

* fix: renaming Zeroshot -> ZeroShot

Addresses #2078

* rename 1

* rename 2

* format

* fixed error

* 1.36.27

Automatically generated by python-semantic-release

* fix: Update AmazonPolarityClassification license (#2402)

Update AmazonPolarityClassification.py

* fix b1ade name (#2403)

* 1.36.28

Automatically generated by python-semantic-release

* Minor style changes (#2396)

* fix: renaming Zeroshot -> ZeroShot

Addresses #2078

* fix: minor style changes

Addresses #2078

* rename 1

* rename 2

* format

* fixed error

---------

Co-authored-by: Isaac Chung <[email protected]>

* Added new dataset and tasks - ClusTREC-covid, clustering of thematic covid-related scientific papers (#2302)

* Clustrec covid new dataset and task

* fix

* fix

* fix

* fix

* fix

* descriptive stats

* change all mentions of clustrec-covidp2p to clustrec-covid

* change ' to "

* Update tasks table

* fix: Major updates to docs + make mieb dep optional (#2397)

* fix: renaming Zeroshot -> ZeroShot

Addresses #2078

* fix: minor style changes

Addresses #2078

* fix: Major updates to documentation

This PR does the following:
- This introduced other modalities more clearly in the documentation as well as make it easier to transition to a full on documentation site later.
- added minor code updates due to discovered inconsistencies in docs and code.
- Added the MMTEB citation where applicable
- makes the docs ready to move torchvision to an optional dependency

* Moved VISTA example

* rename 1

* rename 2

* format

* fixed error

* fix: make torchvision optional (#2399)

* fix: make torchvision optional

* format

* add docs

* minor fix

* remove transform from Any2TextMultipleChoiceEvaluator

---------

Co-authored-by: Isaac Chung <[email protected]>

* move Running SentenceTransformer model with prompts to usage

---------

Co-authored-by: Isaac Chung <[email protected]>

* 1.36.29

Automatically generated by python-semantic-release

* remove Arabic_Triplet_Matryoshka_V2.py (#2405)

* Min torchvision>0.2.1 (#2410)

matching torch>1.0.0

* fix: Add validation to model_name in `ModelMeta` (#2404)

* add test for name validation

* upd docs

* upd cohere name

* fix tests

* fix name for average_word_embeddings_komninos

* fix name for average_word_embeddings_komninos

* fix reranker test

* fix reranker test

* 1.36.30

Automatically generated by python-semantic-release

* [MIEB] "capability measured"-Abstask 1-1 matching refactor [1/3]: reimplement CV-Bench (#2414)

* refactor CV-Bench

* reimplement CV Bench

* remove abstask/evaluator/tests for Any2TextMultipleChoice

* rerun descriptive stats

* Update tasks table

* fix: Add option to remove benchmark from leaderboard (#2417)

fix: Add option to remove benchmark from leaderboard

fixes #2413

This only removes the benchmark from the leaderboard but keeps it in MTEB.

* 1.36.31

Automatically generated by python-semantic-release

* fix: Add VDR Multilingual Dataset (#2408)

* Added VDR Multilingual Dataset

* address comments

* make lint

* Formated Dataset for retrieval

* Update mteb/tasks/Retrieval/multilingual/VdrMultilingualRetrieval.py

Co-authored-by: Roman Solomatin <[email protected]>

* Update mteb/tasks/Retrieval/multilingual/VdrMultilingualRetrieval.py

Co-authored-by: Roman Solomatin <[email protected]>

* make lint

* corrected date

* fix dataset building

* move to image folder

---------

Co-authored-by: Roman Solomatin <[email protected]>
Co-authored-by: Isaac Chung <[email protected]>

* Update tasks table

* 1.36.32

Automatically generated by python-semantic-release

* HOTFIX: pin setuptools (#2423)

* pin setuptools

* pin setuptools

* pin setuptools in makefile

* try ci

* fix ci

* remove speed from installs

* add __init__.py Clustering > kor folder,  And   edit __init__.py in Clustering folder (#2422)

* add PatentFnBClustering.py

* do make lint and revise

* rollback Makefile

* Update mteb/tasks/Clustering/kor/PatentFnBClustering.py

Co-authored-by: Roman Solomatin <[email protected]>

* klue_mrc_domain

* make lint

* klue_modified_clustering_dataset

* clustering & kor folder add __init.py

* clustering & kor folder add __init__.py

* task.py roll-back

* correct text_creation to sample_creation & delete form in MetaData

* correct task_subtype in TaskMetaData

* delete space

* edit metadata

* edit task_subtypes

---------

Co-authored-by: Roman Solomatin <[email protected]>

* Update tasks table

* Update speed dependencies with new setuptools release (#2429)

* add richinfoai models (#2427)

* add richinfoai models

add richinfoai models

* format codes by linter

format codes by linter

* Added Memory Usage column on leaderboard (#2428)

* docs: typos; Standardize spacing; Chronological order (#2436)

* Fix typos; add chrono order

* Fix spacing

* fix: Add model specific dependencies in pyproject.toml (#2424)

* Add model specific dependencies in pyproject.toml

* Update documentation

* 1.36.33

Automatically generated by python-semantic-release

* [MIEB] "capability measured"-Abstask 1-1 matching refactor [2/3]: reimplement r-Oxford and r-Paris (#2442)

* MutipleChoiceEvaluationMixin; reimplement r-Oxford and r-Paris; rerun stats

* modify benchmark list

* fix citation

* Update tasks table

* Error while evaluating MIRACLRetrievalHardNegatives: 'trust_remote_code' (#2445)

Fixes #2444

* Feat/searchmap preview (#2420)

* Added meta information about SearchMap_Preview model to the model_dir

* Added meta information about SearchMap_Preview model to the model_dir

* updated revision name

* Device loading and cuda cache cleaning step left out

* removed task instructions since it's not necessary

* changed sentence transformer loader to mteb default loader and passed instructions s model prompts

* Included searchmap to the models overview page

* Included searchmap to the models overview page

* added meta data information about where model was adpated from

* Update mteb/models/searchmap_models.py

* fix lint

* lint

---------

Co-authored-by: Roman Solomatin <[email protected]>
Co-authored-by: Roman Solomatin <[email protected]>

* Add Background Gradients in Summary and Task Table (#2392)

* Add Background Gradients in Summary and Task Table

* Remove warnings and add light green cmap

* Address comments

* Separate styling function

* address comments

* added comments

* add ops_moa_models (#2439)

* add ops_moa_models

* add custom implementations

* Simplify custom implementation and format the code

* support SentenceTransformers

* add training datasets

* Update mteb/models/ops_moa_models.py

Co-authored-by: Roman Solomatin <[email protected]>

* update training_datasets

---------

Co-authored-by: kunka.xgw <[email protected]>
Co-authored-by: Roman Solomatin <[email protected]>

* leaderboard fix (#2456)

* ci: cache `~/.cache/huggingface` (#2464)

ci: cache ~/.cache/huggingface

Co-authored-by: sam021313 <[email protected]>

* [MIEB] "capability measured"-Abstask 1-1 matching refactor [3/3]: reimplement ImageCoDe (#2468)

* reimplement ImageCoDe with ImageTextPairClassification

* add missing stats file

* Update tasks table

* fix: Adds family of NeuML/pubmedbert-base-embedding models (#2443)

* feat: added pubmedbert model2vec models

* fix: attribute model_name

* fix: fixed commit hash for pubmed_bert model2vec models

* fix: changes requested in PR 2443

* fix: add nb_sbert model (#2339)

* add_nb_sbert_model

* Update nb_sbert.py

added n_parameters and release_date

* Update mteb/models/nb_sbert.py

Co-authored-by: Roman Solomatin <[email protected]>

* Update nb_sbert.py

fix: make lint

* added nb_sbert to overview.py + ran make lint

* Update nb_sbert.py

Fix error: Input should be a valid date or datetime, month value is outside expected range of 1-12

---------

Co-authored-by: Roman Solomatin <[email protected]>

* 1.36.34

Automatically generated by python-semantic-release

* suppress logging warnings on leaderboard (#2406)

* suppress logging warnings

* remove loggers

* return blocks

* rename function

* fix gme models

* add server name

* update after merge

* fix ruff

* fix: E5 instruct now listed as sbert compatible (#2475)

Fixes #1442

* 1.36.35

Automatically generated by python-semantic-release

* [MIEB] rename VisionCentric to VisionCentricQA (#2479)

rename VisionCentric to VisionCentricQA

* ci: Run dataset loading only when pushing to main (#2480)

Update dataset_loading.yml

* fix table in tasks.md (#2483)

* Update tasks table

* fix: add prompt to NanoDBPedia (#2486)

* 1.36.36

Automatically generated by python-semantic-release

* Fix Task Lang Table (#2487)

* Fix Task Lang Table

* added tasks.md

* fix

* fix: Ignore datasets not available in tests (#2484)

* 1.36.37

Automatically generated by python-semantic-release

* [MIEB] align main metrics with leaderboard (#2489)

align main metrics with leaderboard

* typo in model name (#2491)

* SpeedTask add deprecated warning (#2493)

* Docs: Update README.md (#2494)

Update README.md

* fix transformers version for now (#2504)

* Fix typos (#2509)

* ci: refactor TaskMetadata eval langs test (#2501)

* refactor eval langs test

* function returns None

* add hard negatives tasks in _HISTORIC_DATASETS

* rename to ImageClustering folder (#2516)

rename folder

* Clean up trailing spaces citation (#2518)

* rename folder

* trailing spaces

* missed one

* [mieb] Memotion preprocessing code made more robust and readable (#2519)

* fix: validate lang code in ModelMeta (#2499)

* Update pyproject.toml (#2522)

* 1.36.38

Automatically generated by python-semantic-release

* Fix leaderboard version (#2524)

* fix gradio leaderboard run

* update docs

* Fix gte-multilingual-base embed_dim (#2526)

* [MIEB] Specify only the multilingual AggTask for MIEB-lite (#2539)

specify only the multilingual AggTask

* [mieb] fix hatefulmemes (#2531)

* fix hatefulmeme

* add to description and use polars instead

---------

Co-authored-by: Isaac Chung <[email protected]>

* Model conan (#2534)

* conan_models

* conan_models

* refactor code

* refactor code

---------

Co-authored-by: shyuli <[email protected]>

* fix: Update mteb.get_tasks with an exclude_aggregate parameter to exclude aggregate tasks (#2536)

* Implement task.is_aggregate check

* Add `mteb.get_tasks` parameter `include_aggregate` to exclude aggregate tasks if needed

* Update mteb.run with the new `task.is_aggregate` parameter

* Add tests

* Ran linter

* Changed logic to `exclude_aggregate`

* Updated from review comments

* Exclude aggregate by default false in get_tasks

* 1.36.39

Automatically generated by python-semantic-release

* docs: Add MIEB citation in benchmarks (#2544)

Add MIEB citation in benchmarks

* Add 2 new Vietnamese Retrieval Datasets (#2393)

* [ADD] 2 new Datasets

* [UPDATE] Change bibtex_citation for GreenNodeTableMarkdownRetrieval as TODO

* [UPDATE] Change bibtex_citation for ZacLegalTextRetrieval as TODO

* Update tasks table

* fix: CacheWrapper per task (#2467)

* feat: CacheWrapper per task

* refactor logic

* update documentation

---------

Co-authored-by: Florian Rottach <[email protected]>

* 1.36.40

Automatically generated by python-semantic-release

* misc: move MMTEB scripts and notebooks to separate repo (#2546)

move mmteb scripts and notebooks to separate repo

* fix: Update requirements in JinaWrapper (#2548)

fix: Update package requirements in JinaWrapper for einops and flash_attn

* 1.36.41

Automatically generated by python-semantic-release

* Docs: Add MIEB to README (#2550)

Add MIEB to README

* Add xlm_roberta_ua_distilled (#2547)

* defined model metadata for xlm_roberta_ua_distilled

* Update mteb/models/ua_sentence_models.py

Co-authored-by: Roman Solomatin <[email protected]>

* included ua_sentence_models.py in overview.py

* applied linting, added missing fields in ModelMeta

* applied linting

---------

Co-authored-by: Roman Solomatin <[email protected]>

* fix me5 training data config to include xquad dataset (#2552)

* fix: me5 training data config to include xquad dataset

* Update mteb/models/e5_models.py

update: xquad key name

Co-authored-by: Roman Solomatin <[email protected]>

* fix: ME5_TRAINING_DATA format

---------

Co-authored-by: Roman Solomatin <[email protected]>

* feat: Added dataframe utilities to BenchmarkResults (#2542)

* fix: Added dataframe utilities to BenchmarkResults

- Added `get_results_table`. I was considering renaming it to `to_dataframe` to align with `tasks.to_dataframe`. WDYT?
- Added tests for ModelResults and BenchmarkResults
- Added a few utility functions where needed
- Added docstrings throughout ModelResults and BenchmarkResults
- Added todo comments for missing aspects - mostly v2 - but `join_revisions` seems like it could use an update before then.

Prerequisite for #2454:

@ayush1298 can I ask you to review this PR as well? I hope this gives an idea of what I was hinting at. Sorry that it took a while. I wanted to make sure to get it right.

* refactor to to_dataframe and combine common dependencies

* ibid

* fix revision joining after discussion with @x-tabdeveloping

* remove strict=True for zip() as it requires Python 3.10+

* updated mock cache
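
The `to_dataframe` idea discussed above is essentially a long-to-wide pivot over per-task scores; a hedged sketch using plain dicts rather than mteb's actual `BenchmarkResults` objects (the row layout here is an assumption):

```python
# Stand-in long-format rows: one entry per (model, task) score.
results = [
    {"model": "deepvk/USER-bge-m3", "task": "RuSTSBenchmarkSTS", "score": 88.11},
    {"model": "sergeyzh/rubert-tiny-turbo", "task": "RuSTSBenchmarkSTS", "score": 78.47},
]


def to_table(rows):
    """Pivot long-format rows into a nested {model: {task: score}} mapping."""
    table = {}
    for r in rows:
        table.setdefault(r["model"], {})[r["task"]] = r["score"]
    return table


table = to_table(results)
print(table["deepvk/USER-bge-m3"]["RuSTSBenchmarkSTS"])  # 88.11
```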

* 1.37.0

Automatically generated by python-semantic-release

* fix e5_R_mistral_7b (#2490)

* fix e5_R_mistral_7b

* change wrapper

* address comments

* Added kwargs for pad_token

* correct lang format

* address comments

* add revision

---------

Co-authored-by: Roman Solomatin <[email protected]>

* fix unintentional working of filters on leaderboard (#2535)

* fix unintentional working of filters on leaderboard

* address comments

* make lint

* address comments

* rollback unnecessary changes

* feat: UI Overhaul (#2549)

* Bumped gradio version to latest

* Added new Gradio table functionality to leaderboard

* Removed search bar

* Changed color scheme in plot to match the table

* Added new benchmark selector in sidebar

* Changed not activated button type to secondary

* Short-circuited callbacks that are based on language selection

* Re-added column width calculation since it got messed up

* Commented out gradient for per-task table as it slowed things down substantially

* Styling and layout updates

* Adjusted comments according to reviews

* Converted all print statements to logger.debug

* Removed pydantic version fix

* Ran linting

* Remove commented out code

Co-authored-by: Kenneth Enevoldsen <[email protected]>

* Moved English,v1 to Legacy section

* Closed the benchmark sharing accordion by default

* Adjusted markdown blocks according to suggestions

* Ran linter

---------

Co-authored-by: Kenneth Enevoldsen <[email protected]>

* 1.38.0

Automatically generated by python-semantic-release

* add USER2 (#2560)

* add user2

* add training code

* update prompts

* Fix leaderboard entry for BuiltBench (#2563)

Fix leaderboard entry for BuiltBench (#2562)

Co-authored-by: Mehrzad Shahin-Moghadam <[email protected]>

* fix: jasper models embeddings having nan values (#2481)

* 1.38.1

Automatically generated by python-semantic-release

* fix frida datasets (#2565)

* Add relle (#2564)

* Add relle
* defined model metadata for relle

* Add mteb/models/relle_models.py

* Update mteb/models/relle_models.py

Co-authored-by: Roman Solomatin <[email protected]>

* lint after commit

run after "make lint"

* Add into model_modules

Add model into model_modules and lint check

---------

Co-authored-by: Roman Solomatin <[email protected]>

* Backfill task metadata for GermanDPR and GermanQuAD (#2566)

* Add metadata for GermanDPR and GermanQuAD

* PR improvements

* Update tasks table

* Add  ModelMeta for CodeSearch-ModernBERT-Crow-Plus (#2570)

* Add files via upload

* Update shuu_model.py

* Update overview.py

* Update mteb/models/shuu_model.py

Co-authored-by: Roman Solomatin <[email protected]>

---------

Co-authored-by: Roman Solomatin <[email protected]>

* Docs: Improve MIEB docs (#2569)

* Add missing annotations (#2498)

* Update tasks table

* move icon & name to benchmark dataclass (#2573)

* Remove the comments from ImageEncoder (#2579)

* fix: Add Encodechka benchmark (#2561)

* add tasks

* add benchmark

* fix imports

* update stsb split

* Update tasks table

* 1.38.2

Automatically generated by python-semantic-release

* fix FlagEmbedding package name (#2588)

* fix codecarbon version (#2587)

* Add MIEB image only benchmark (#2590)

* add vision only bench

* add description

* correct zs task modalities

* specify tasks param

* Add image only MIEB benchmark to LB left panel (#2596)

* Update benchmarks.py

* make lint

* add to left side bar

* update Doubao-1.5-Embedding (#2575)

* update seed-embedding

* update seed models

* fix linting and tiktoken problem

* fix tiktoken bug

* fix lint

* update name

* Update mteb/models/seed_models.py

adopt suggestion

Co-authored-by: Roman Solomatin <[email protected]>

* update logging

* update lint

---------

Co-authored-by: zhangpeitian <[email protected]>
Co-authored-by: Roman Solomatin <[email protected]>

* fix: Add WebSSL models (#2604)

* add 2 web SSL dino models

* add models from collection and revisions

* update memory_usage_mb and embed dim

* use automodel instead

* fix mieb citation (#2606)

* 1.38.3

Automatically generated by python-semantic-release

* Update Doubao-1.5-Embedding (#2611)

* update seed-embedding

* update seed models

* fix linting and tiktoken problem

* fix tiktoken bug

* fix lint

* update name

* Update mteb/models/seed_models.py

adopt suggestion

Co-authored-by: Roman Solomatin <[email protected]>

* update logging

* update lint

* update link

---------

Co-authored-by: zhangpeitian <[email protected]>
Co-authored-by: Roman Solomatin <[email protected]>

* CI: update benchmark table (#2609)

* update benchmark table

* fix table

* Update tasks & benchmarks tables

* Update tasks & benchmarks tables

* Update Doubao-1.5-Embedding revision (#2613)

* update seed-embedding

* update seed models

* fix linting and tiktoken problem

* fix tiktoken bug

* fix lint

* update name

* Update mteb/models/seed_models.py

adopt suggestion

Co-authored-by: Roman Solomatin <[email protected]>

* update logging

* update lint

* update link

* update revision

---------

Co-authored-by: zhangpeitian <[email protected]>
Co-authored-by: Roman Solomatin <[email protected]>

* Update tasks & benchmarks tables

* CI: fix table  (#2615)

* Update tasks & benchmarks tables

* Update gradio version (#2558)

* Update gradio version

Closes #2557

* bump gradio

* fix: Removed missing dataset for MTEB(Multilingual) and bumped version

We should probably just have done this earlier to ensure that the multilingual benchmark is runnable.

* CI: fix infinitely committing issue (#2616)

* fix token

* try to trigger

* add token

* test ci

* Update tasks & benchmarks tables

* Update tasks & benchmarks tables

* remove test lines

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

* Add ScandiSent dataset (#2620)

* add scandisent dataset

* add to init

* typo

* lint

* 1.38.4

Automatically generated by python-semantic-release

* Format all citations (#2614)

* Fix errors in bibtex_citation

* Format all bibtex_citation fields

* format benchmarks

* fix format

* Fix tests

* add formatting script

* fix citations (#2628)

* Add Talemaader pair classification task (#2621)

Add talemaader pair classification task

* fix citations

* fix citations

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions <[email protected]>
Co-authored-by: Roman Solomatin <[email protected]>
Co-authored-by: Kenneth Enevoldsen <[email protected]>
Co-authored-by: Uri K <[email protected]>
Co-authored-by: chenghao xiao <[email protected]>
Co-authored-by: Munot Ayush Sunil <[email protected]>
Co-authored-by: OnandOn <[email protected]>
Co-authored-by: richinfo-ai <[email protected]>
Co-authored-by: Niklas Muennighoff <[email protected]>
Co-authored-by: Adewole Babatunde <[email protected]>
Co-authored-by: Roman Solomatin <[email protected]>
Co-authored-by: ahxgw <[email protected]>
Co-authored-by: kunka.xgw <[email protected]>
Co-authored-by: Sam Heymann <[email protected]>
Co-authored-by: sam021313 <[email protected]>
Co-authored-by: Nadia Sheikh <[email protected]>
Co-authored-by: theatollersrud <[email protected]>
Co-authored-by: hongst <[email protected]>
Co-authored-by: E. Tolga Ayan <[email protected]>
Co-authored-by: lllsy12138 <[email protected]>
Co-authored-by: shyuli <[email protected]>
Co-authored-by: Siddharth M. Bhatia <[email protected]>
Co-authored-by: Bao Loc Pham <[email protected]>
Co-authored-by: Flo <[email protected]>
Co-authored-by: Florian Rottach <[email protected]>
Co-authored-by: Alexey Vatolin <[email protected]>
Co-authored-by: Olesksii Horchynskyi <[email protected]>
Co-authored-by: Pandaswag <[email protected]>
Co-authored-by: Márton Kardos <[email protected]>
Co-authored-by: Mehrzad Shahin-Moghadam <[email protected]>
Co-authored-by: Mehrzad Shahin-Moghadam <[email protected]>
Co-authored-by: Youngjoon Jang <[email protected]>
Co-authored-by: 24September <[email protected]>
Co-authored-by: Jan Karaś <[email protected]>
Co-authored-by: Shuu <[email protected]>
Co-authored-by: namespace-Pt <[email protected]>
Co-authored-by: zhangpeitian <[email protected]>
Co-authored-by: Imene Kerboua <[email protected]>
isaac-chung added a commit that referenced this pull request Jun 22, 2025

* add Bilingual English-Danish parallel corpus from The Danish Medicines Agency (#2633)

* add Bilingual English-Danish parallel corpus from The Danish Medicines Agency

* bump dataset revision

* format bibtex

* format bibtex

* Remove irrelevant test (#2630)

remove irrelevant test

* Revert "CI: fix infinitely committing issue (#2616)" (#2636)

This reverts commit 82dcb3d.

* Update tasks & benchmarks tables

* Remove `typer` dependency from citation script (#2629)

remove typer dependency from citation script

* CI format citations (#2649)

* ci format citations

* add files

* remove from lint CI

* test lint

* test lint

* fix names

* fix: Update VisualSTS Aggregate task modalities (#2597)

* Update STS17MultilingualVisualSTS.py

* fix STSBenchmarkMultilingualVisualSTS

---------

Co-authored-by: Isaac Chung <[email protected]>

* 1.38.5

Automatically generated by python-semantic-release

* Add tests for leaderboard build (#2631)

* Add tests for leaderboard build

* add new action

* remove build tests from other actions

* fix tests

* correct exclusion of test

* added timeout constant

* fix: SIB200 machine translated > human translated (#2665)

As correctly pointed out in:

https://huggingface.co/datasets/mteb/sib200/discussions/1

* 1.38.6

Automatically generated by python-semantic-release

* fix: Update datasets which can't be loaded with `datasets>=3.0` (#2661)

fix: Update datasets which can't be loaded with `datasets>=3.0` (#1619)

* reupload datasets

* fix loader

* remove commented code

* lint

* update pyproject dependencies

* rename model RELLE to CHAIN19 (#2671)

* Add relle
* defined model metadata for relle

* Add mteb/models/relle_models.py

* Update mteb/models/relle_models.py

Co-authored-by: Roman Solomatin <[email protected]>

* lint after commit

run after "make lint"

* Add into model_modules

Add model into model_modules and lint check

* rename model
change model name

* rename model
change model name

---------

Co-authored-by: Roman Solomatin <[email protected]>

* 1.38.7

Automatically generated by python-semantic-release

* Update final version of Doubao-1.5-Embedding (Rename to Seed1.5-Embedding) (#2674)

* update seed-embedding

* update seed models

* fix linting and tiktoken problem

* fix tiktoken bug

* fix lint

* update name

* Update mteb/models/seed_models.py

adopt suggestion

Co-authored-by: Roman Solomatin <[email protected]>

* update logging

* update lint

* update link

* update revision

* update Doubao-1.5-Embedding revision 3

* rename Doubao-1.5-Embedding to Seed1.5-Embedding

---------

Co-authored-by: zhangpeitian <[email protected]>
Co-authored-by: Roman Solomatin <[email protected]>

* fix: Allow empty string for openai models (#2676)

* fix for empty string input to openai/text-embedding-3-large

* fix: Allow empty string in openai models

closes: #1650

* fix based on review

* Updated docstring

---------

Co-authored-by: ayush1298 <[email protected]>
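
A fix like this plausibly sanitizes inputs before the API call; a hedged sketch where the replace-with-a-space strategy and the `sanitize_inputs` helper are assumptions, not mteb's actual code:

```python
def sanitize_inputs(texts):
    # OpenAI embedding endpoints reject empty strings, so substitute a
    # single space for any empty or whitespace-only input (assumed strategy).
    return [t if t.strip() else " " for t in texts]


print(sanitize_inputs(["hello", "", "  "]))  # ['hello', ' ', ' ']
```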

* 1.38.8

Automatically generated by python-semantic-release

* Leaderboard: UI simplifications for menus (#2672)

* Leaderboard: UI simplifications for menus

Did a few things to simplify the leaderboard UI.

Changes:
- Combined FAQ entries
- Created dropdowns in the select benchmark menu sidebar
- Removed reference to arena
- Removed reference to old leaderboard
- reduced size of select menu
- reduced the size of acknowledgements
- removed farsi from the selection (as it is a beta)

refactors:
- refactored to use a class for menu items
- refactored text segments out of app.py

* fixed comment

* fixes for sizes

* fix modality for `OVENIT2TRetrieval` (#2678)

fix modality

* fix: `MTEB(Code, v1)`  languages (#2679)

fix code languages

* 1.38.9

Automatically generated by python-semantic-release

* Correction in docs (#2688)

* Fix for Openai_Text-Embedding3-Small (#2702)

* Fix for Openai_Text-Embedding3-Small

* better syntax for readability

* Fix for Openai_Text-Embedding3-Small (#2702)

* Fix for Openai_Text-Embedding3-Small

* better syntax for readability

* fix: Ensure that optional dependencies are compatible and if not state it (#2706)

Fixes mistakes introduced in #2424

It seems like many of these requirements don't exist (voyageai>=1.0.0). @ayush1298 I am hoping you could clear up how this happened?

* fix: Only install mteb into site packages (#2618)

* Restrict installation directory

* fix

* namespace false

* add star

* add pont

* fix import

* fix import

* add init files

* fix setuptools find

* fix image init

* add missing templates

---------

Co-authored-by: Roman Solomatin <[email protected]>

* 1.38.10

Automatically generated by python-semantic-release

* docs: Updated the PR template and improved submission docs (#2704)

* docs: Updated the PR template and improved submission docs

1) Updated PR template to only include checklist for datasets and models. The other checklists were essentially just tests.
2) I have updated the documentation for adding models. Notably I have split out the implementation segment, which I think makes it more readable.
3) Required that you argue for a dataset before addition

fixes #2568

* Apply suggestions from code review

Co-authored-by: Isaac Chung <[email protected]>

---------

Co-authored-by: Isaac Chung <[email protected]>

* fix: Remove models from the leaderboard (#2705)

* fix: Remove models from the leaderboard

I removed both models from the leaderboard by unlinking them from the import tree. I think this is the easiest way to handle a model that is not currently public.

* format

* 1.38.11

Automatically generated by python-semantic-release

* fix: Rename gemini-embedding-exp-03-07 to gemini-embedding-001 (#2711)

* Rename gemini-embedding-exp-03-07 to gemini-embedding-001

* update reference link to the Vertex AI API doc

* 1.38.12

Automatically generated by python-semantic-release

* fix: Integrate `lightonai/GTE-ModernColBERT-v1` (#2708)

* fix: Integrate `lightonai/GTE-ModernColBERT-v1`

Fixes #2673

* fixes based on corrections

* 1.38.13

Automatically generated by python-semantic-release

* docs: fix number of tasks for eng, v2 in docs (#2720)

* fix: Added potion-multilingual-128M (#2717)

* Added ModelMeta for potion-multilingual-128M

* Fixed linting

* Fixed linting

* Updated date

* 1.38.14

Automatically generated by python-semantic-release

* Update the max tokens for gemini-embedding-001 (#2725)

* fix: Ara and ben classification dataset cleaning (#2632)

* Improve classification datasets quality for ara and ben langs

* add missing AJGT

* fix format

* change ajgt description

* Fix numbers in description, add link to pull request

* Add too short filter

* Link in markdown format

* Update tasks & benchmarks tables

* fix: Update Seed1.5-Embedding API (#2724)

* update seed1.5-embedding api

* update seed1.5-embedding api

* update Seed1.5-Embedding API

* update Seed1.5-Embedding resolve comments

* update Seed1.5-Embedding lint

* Update mteb/models/seed_models.py

---------

Co-authored-by: Kenneth Enevoldsen <[email protected]>

* 1.38.15

Automatically generated by python-semantic-release

* fix: Add vidore v2 benchmarks (#2713)

* adding vidore benchmarks

* fix typo

* clean vidore names + per lang eval

* lint

* vidore names

* bibtex fix

* fix revision

* vidore v2 citation

* update citation format and fix per-language mappings

* lint: citations

* typo citations

* Update tasks & benchmarks tables

* 1.38.16

Automatically generated by python-semantic-release

* fix: `IndicQARetrieval` loader (#2729)

* fix indic qa

* add kwargs

* 1.38.17

Automatically generated by python-semantic-release

* fix: Promote Persian benchmark to v1 (#2707)

* Switch versioning from beta to v1 and add v1 to benchmark selector

* Update Farsi benchmark display name, task IDs, and metadata

* Add Hakim Model

* fix hakim version

* update

* make lint

* fix: Promote Persian benchmark to v1

---------

Co-authored-by: mehran <[email protected]>
Co-authored-by: Kenneth Enevoldsen <[email protected]>

* Update tasks & benchmarks tables

* 1.38.18

Automatically generated by python-semantic-release

* Add ViDoRe combined benchmark and add to leaderboard side panel (#2732)

* add ViDoRe combined benchmark and add to leaderboard side panel

* Update benchmark_selector.py

* Update tasks & benchmarks tables

* fix: Rename display name of VDR (#2734)

* Update tasks & benchmarks tables

* 1.38.19

Automatically generated by python-semantic-release

* fix: Add colpali models family (#2721)

* add colpali models

* add colpali as framework

* add colpali as framework

* update metadata and add colsmol

* fix typos

* account for revision

* add training data info and lint

* modify meta

* correct colmodels meta and add colnomic 7b

* fix typo in toml (colpali subdeps)

* refine colmodel loading and metadata

* 1.38.20

Automatically generated by python-semantic-release

* fix: Correct embedding dimension for bge-m3 (#2738)

Fixes #2735

* 1.38.21

Automatically generated by python-semantic-release

* docs: Updated description of FEVER (#2745)

* docs: Updated description of FEVER

Update the description to state that the corpus is the same as FEVER's, since we have had [multiple questions on it](https://huggingface.co/datasets/mteb/climate-fever/discussions/2)

* minor

* Backfill task metadata for BigPatentClustering and AllegroReviews (#2755)

* big-patent

* allegro-reviews

* Update tasks & benchmarks tables

* Update Seed1.5 training data (#2749)

* update seed1.5 training data

* update seed1.5 training data

* fix: Update caltech101 (#2759)

* docs: Updated description of FEVER

Update the description to state that the corpus is the same as FEVER, as we have [multiple questions on it](https://huggingface.co/datasets/mteb/climate-fever/discussions/2)

* fix: Update Caltech101 to different source

Ran both versions of one of the tasks using `nomic-ai/nomic-embed-text-v1.5`; both scores match:

### Old

```
{
  "dataset_revision": "851374102055782c84f89b1b4e9d128a6568847b",
  "task_name": "Caltech101",
  "mteb_version": "1.38.4",
  "scores": {
    "test": [
      {
        "accuracy": 0.897863,
```

### New
```
{
  "dataset_revision": "52439cf6d4f6ebf563d8cdc7f2c5371d9efd2686",
  "task_name": "Caltech101",
  "mteb_version": "1.38.4",
  "scores": {
    "test": [
      {
        "accuracy": 0.897929,
```

* 1.38.22

Automatically generated by python-semantic-release

* Add missing PatchCamelyon_labels.txt (#2756)

* ci: Delete cache in Model loading test only when model is loaded (#2761)

* only delete cache when model loaded

* testing it out

* fix: Add `cadet-embed-base-v1` (#2727)

* update

* update overview.py for models

* update

* update

* 1.38.23

Automatically generated by python-semantic-release

* Fixing Google embedding task type for STS (#2767)

The type `SIMILARITY` is invalid; the correct one is `SEMANTIC_SIMILARITY`. See https://cloud.google.com/vertex-ai/generative-ai/docs/embeddings/task-types#supported_task_types
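A small sketch of the fix's intent: validating the task type up front instead of letting the request fail server-side. The set of supported values below is copied from the Google Cloud docs linked above at the time of writing; it is an assumption that the list is complete and current, and `resolve_task_type` is an illustrative helper, not part of mteb.

```python
# Task types documented for Vertex AI text embeddings (assumed current as of
# the linked docs page; Google may add or remove values over time).
SUPPORTED_TASK_TYPES = {
    "RETRIEVAL_QUERY",
    "RETRIEVAL_DOCUMENT",
    "SEMANTIC_SIMILARITY",
    "CLASSIFICATION",
    "CLUSTERING",
    "QUESTION_ANSWERING",
    "FACT_VERIFICATION",
    "CODE_RETRIEVAL_QUERY",
}


def resolve_task_type(task_type: str) -> str:
    """Return the task type unchanged if valid, else raise before any API call."""
    if task_type not in SUPPORTED_TASK_TYPES:
        raise ValueError(
            f"Invalid Vertex AI task type {task_type!r}; "
            f"expected one of {sorted(SUPPORTED_TASK_TYPES)}"
        )
    return task_type
```

With this check, the old value `"SIMILARITY"` raises a `ValueError` locally, while `"SEMANTIC_SIMILARITY"` passes through.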

* docs: Leaderboard simplifications (#2764)

* docs: Leaderboard simplifications

Simplified the sidebar, notably:

1) Combined Language and Regional (since these are all languages)
2) Folded all sections (with visual document retrieval, images start to take up a lot of space)
3) Removed Legacy and instead added "Other" under Language, where I moved "English Legacy"

I also restructured the code so that nesting is easier.

Is it also possible to create a separate section? (see dummy screenshot)

* refactor to reduce nesting

* format

* fix: add xet support (#2603)

* add xet version

* add doc comment

* change xet requirements

* Update docs/usage/usage.md

---------

Co-authored-by: Kenneth Enevoldsen <[email protected]>

* 1.38.24

Automatically generated by python-semantic-release

* fix: Update giga embeddings (#2774)

* update giga embeddings

* update giga embeddings

---------

Co-authored-by: Kolodin Egor <[email protected]>

* ci: add new prefixes to releases (#2766)

add new prefixes

* 1.38.25

Automatically generated by python-semantic-release

* fix: Update Caltech101 datasets to latest revision [v1] (#2778)

* fix: Update Caltech101 datasets to latest revision [v2]

Fixes #2770, but only in v1.

```
# tested using:

task: mteb.AbsTask = mteb.get_task("Caltech101ZeroShot")
task.load_data()
task.get_candidate_labels()
```

* fix rev

* 1.38.26

Automatically generated by python-semantic-release

* fix: CachedEmbeddingWrapper issues in both documentation and code (#2779)

Fixes #2772

* 1.38.27

Automatically generated by python-semantic-release

* dataset: Add miracl vision (#2736)

* add miracl vision

* add miracl vision

* ruff

* cast

* image

* image

* add langs

* add langs

* add langs

* add langs

* descriptive stats

* lint

* lint

* lint

* remove com

* Update tasks & benchmarks tables

* model: Add Qwen3 Embedding model (#2769)

* Init code

* Remove extra config and lint code

* use sentence transformer

* add revisions

* fix lint

* Apply suggestions from code review

Co-authored-by: Roman Solomatin <[email protected]>

* fix lint

* add framework

---------

Co-authored-by: Roman Solomatin <[email protected]>

* bump ruff (#2784)

* Update issue and pr templates (#2782)

* Update issue templates

* Update bug_report.md

* test yaml template

* add templates

* update templates

* add emojis

* fix typo

* Apply suggestions from code review

Co-authored-by: Kenneth Enevoldsen <[email protected]>

* update issue titles

* update PR template

* remove PR templates

---------

Co-authored-by: Kenneth Enevoldsen <[email protected]>

* model: Add GeoGPT-Research-Project/GeoEmbedding (#2773)

* add model: geogpt_models

* update geogpt_models

* use InstructSentenceTransformerWrapper

* resolve pylint warning

* format geogpt_models.py

* Update mteb/models/geogpt_models.py

Co-authored-by: Roman Solomatin <[email protected]>

* Update mteb/models/geogpt_models.py

---------

Co-authored-by: zhangzeqing <[email protected]>
Co-authored-by: Roman Solomatin <[email protected]>
Co-authored-by: Kenneth Enevoldsen <[email protected]>

* model: add fangxq/XYZ-embedding (#2741)

* add xyz model

* add xyz model

* add xyz model

* update

* update

* update

* update

* update

* update

* update

* lint

---------

Co-authored-by: Roman Solomatin <[email protected]>
Co-authored-by: Kenneth Enevoldsen <[email protected]>

* ci: fix config error for semantic release (#2800)

discussed in: #2796

* dataset: Add R2MED Benchmark (#2795)

* Add files via upload

* Add files via upload

* Update benchmarks.py

* Update __init__.py

* Add files via upload

* Update R2MEDRetrieval.py

* Update run_mteb_r2med.py

* Delete scripts/run_mteb_r2med.py

* Update mteb/tasks/Retrieval/eng/R2MEDRetrieval.py

Co-authored-by: Roman Solomatin <[email protected]>

* Update mteb/tasks/Retrieval/eng/R2MEDRetrieval.py

Co-authored-by: Roman Solomatin <[email protected]>

* Update mteb/tasks/Retrieval/eng/R2MEDRetrieval.py

Co-authored-by: Roman Solomatin <[email protected]>

* Update mteb/tasks/Retrieval/eng/R2MEDRetrieval.py

Co-authored-by: Roman Solomatin <[email protected]>

* Add files via upload

* Delete mteb/descriptive_stats/Retrieval/R2MEDRetrieval.json

* Add files via upload

* Add files via upload

* Add files via upload

* Update R2MEDRetrieval.py

* Add files via upload

* Add files via upload

* Add files via upload

* Add files via upload

* format citations

* Update R2MEDRetrieval.py

* Add files via upload

* Add files via upload

---------

Co-authored-by: Li Lei <[email protected]>
Co-authored-by: Roman Solomatin <[email protected]>

* Update tasks & benchmarks tables

* Update training datasets of GeoGPT-Research-Project/GeoEmbedding (#2802)

update training datasets

Co-authored-by: zhangzeqing <[email protected]>

* fix: Add adapted_from to Cmedqaretrieval (#2806)

* fix: Add adapted_from to Cmedqaretrieval

Also snuck in a fix with form=None, which is no longer valid, but was still used in a few places.

* format

* 1.38.28

Automatically generated by python-semantic-release

* fix: Adding client arg to init method of OpenAI models wrapper (#2803)

* Adding an OpenAI client arg to the init method (e.g., for an already-initialized AzureOpenAI client)

To use OpenAI embedding models via Azure, the model wrapper needs to be initialized with a different client.
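The pattern the PR describes can be sketched as plain dependency injection: accept an optional pre-built client so a caller can supply an `AzureOpenAI` instance in place of the default `OpenAI` one. The class and method names below are illustrative, not the actual mteb wrapper, and the default-client factory is a stand-in for `openai.OpenAI()` kept import-free so the sketch runs anywhere.

```python
class EmbeddingWrapper:
    """Illustrative wrapper showing optional client injection (not mteb's real class)."""

    def __init__(self, model_name: str, client=None):
        self.model_name = model_name
        # Use the injected client when given; otherwise build a default one.
        self._client = client if client is not None else self._default_client()

    @staticmethod
    def _default_client():
        # Placeholder for `openai.OpenAI()`, avoided here to keep the
        # sketch self-contained and runnable without credentials.
        return object()


class FakeAzureClient:
    """Stand-in for an already-configured openai.AzureOpenAI client."""
```

Usage: `EmbeddingWrapper("text-embedding-3-small", client=FakeAzureClient())` stores the injected client untouched, while omitting `client` falls back to the default factory.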

* Update mteb/models/openai_models.py

Co-authored-by: Roman Solomatin <[email protected]>

* Update mteb/models/openai_models.py

* remove comment and format

---------

Co-authored-by: Kenneth Enevoldsen <[email protected]>
Co-authored-by: Roman Solomatin <[email protected]>

* model: Add annamodels/LGAI-Embedding-Preview (#2810)

Add LGAI-Embedding

- Add mteb/models/lgai_embedding_models.py

- Defined model metadata

* fix: Ensure bright uses the correct revision (#2812)

fixes #2811

* 1.38.29

Automatically generated by python-semantic-release

* add description to issue template (#2817)

* add description to template

* fix typo

* model: Added 3 HIT-TMG's KaLM-embedding models (#2478)

* Added HIT-TMG_KaLM-embedding-multilingual-mini-instruct-v1 with instruct wrapper

* Added KaLM_embedding_multilingual_mini_instruct_v1_5

* Added model to overview.py

* Fix Task Count Per Language Table in tasks.md

* resolve conflicts

* remove tasks.md

* Modified get_instruction function

* Added support for prompt dict in get_instruction

* fix lang code

* Address comments

* Delete mteb/models/check_models.py

* added prompts_dict support in InstructSentenceTransformerWrapper

* corrected instruction format

* corrected prompts format

* added correct instruction format

* fix implementation

* remove `if name main`

* add comment

---------

Co-authored-by: Roman Solomatin <[email protected]>

* fix: Reuploaded previously unavailable SNL datasets (#2819)

* fix: Reuploaded previously unavailable SNL datasets

closes #2477

* removed exceptions from tests

* temp fixes

* added temporary fix

* clean up commented out code

* format

* Update tasks & benchmarks tables

* 1.38.30

Automatically generated by python-semantic-release

* docs: Fix some typos in `docs/usage/usage.md` (#2835)

* Update usage.md

* Update usage.md

* Update docs/usage/usage.md

---------

Co-authored-by: Isaac Chung <[email protected]>

* model: Add custom instructions for GigaEmbeddings (#2836)

* add custom instructions

* fixed

* lint

* fix last instruction

---------

Co-authored-by: Kolodin Egor <[email protected]>
Co-authored-by: Roman Solomatin <[email protected]>

* try adding init

* add init in audio pc task eng

* all audio tasks init

* remove script test

---------

Co-authored-by: Roman Solomatin <[email protected]>
Co-authored-by: Kenneth Enevoldsen <[email protected]>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions <[email protected]>
Co-authored-by: namespace-Pt <[email protected]>
Co-authored-by: zhangpeitian <[email protected]>
Co-authored-by: Alexey Vatolin <[email protected]>
Co-authored-by: Imene Kerboua <[email protected]>
Co-authored-by: Ömer Veysel Çağatan <[email protected]>
Co-authored-by: Munot Ayush Sunil <[email protected]>
Co-authored-by: 24September <[email protected]>
Co-authored-by: wang.yuqi <[email protected]>
Co-authored-by: Roman Solomatin <[email protected]>
Co-authored-by: Feiyang <[email protected]>
Co-authored-by: Thomas van Dongen <[email protected]>
Co-authored-by: Paul Teiletche <[email protected]>
Co-authored-by: Mehran Sarmadi <[email protected]>
Co-authored-by: mehran <[email protected]>
Co-authored-by: Dawid Koterwas <[email protected]>
Co-authored-by: Wentao Wu <[email protected]>
Co-authored-by: Manveer Tamber <[email protected]>
Co-authored-by: malteos <[email protected]>
Co-authored-by: Egor <[email protected]>
Co-authored-by: Kolodin Egor <[email protected]>
Co-authored-by: Manuel Faysse <[email protected]>
Co-authored-by: Xin Zhang <[email protected]>
Co-authored-by: Hypothesis-Z <[email protected]>
Co-authored-by: zhangzeqing <[email protected]>
Co-authored-by: fangxiaoquan <[email protected]>
Co-authored-by: Li Lei <[email protected]>
Co-authored-by: annamodels <[email protected]>
Co-authored-by: Sadra Barikbin <[email protected]>
Labels: new benchmark (issues related to adding a new benchmark)