Releases · Lightning-AI/torchmetrics

23 Jul 17:33

Borda

v1.8.0

329a7ec

Latest

First video and vertex metrics

TorchMetrics v1.8.0 introduces three flagship metrics, each addressing a concrete evaluation need in real-world applications.

Video Multi-Method Assessment Fusion (VMAF) is a perceptual video-quality score that closely mirrors human judgment. Streaming services such as Netflix and YouTube use it to optimize encoding ladders for consistent viewer experiences, and video-restoration labs use it to quantify improvements achieved by denoising and super-resolution algorithms.

Continuous Ranked Probability Score (CRPS) evaluates a full predictive distribution rather than a point estimate. Meteorological centers use CRPS to benchmark probabilistic precipitation and temperature forecasts, improving public weather alerts, while energy companies apply it to assess uncertainty in load-demand predictions and refine grid management and trading strategies.
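To make the idea of scoring a whole distribution concrete, here is a minimal sketch of the empirical CRPS for a single observation and a finite ensemble of forecast values. It is plain Python for illustration only, not the TorchMetrics implementation; the function name `crps_ensemble` is hypothetical.

```python
from itertools import product


def crps_ensemble(members, observation):
    """Empirical CRPS of an ensemble forecast for one observed value.

    Hypothetical helper for illustration; lower is better.
    """
    m = len(members)
    # Average distance between each ensemble member and the observation.
    accuracy = sum(abs(x - observation) for x in members) / m
    # Average distance between all ordered pairs of members (ensemble spread).
    spread = sum(abs(a - b) for a, b in product(members, repeat=2)) / (m * m)
    return accuracy - 0.5 * spread


# With a single member the score reduces to the absolute error.
print(crps_ensemble([2.0], 3.0))
# An ensemble that brackets the observation scores better than its members alone.
print(crps_ensemble([1.0, 3.0], 2.0))
```

The spread term is what rewards well-calibrated uncertainty: it offsets part of the error when the ensemble's disagreement is justified by where the observation falls.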

Lip Vertex Error (LVE) measures the discrepancy between predicted and ground-truth lip landmarks to quantify audio-visual synchronization. Localization studios use LVE to validate lip-sync accuracy during film dubbing, while AR/VR developers integrate it into avatar pipelines to ensure natural mouth movements in real-time virtual meetings and social experiences.
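As a rough illustration, LVE is commonly described as the maximum error over the lip vertices in each frame, averaged over all frames. The sketch below assumes that definition with plain Euclidean distance (some implementations use the squared distance instead). The function name and the list-of-tuples input layout are hypothetical and do not reflect the TorchMetrics API.

```python
import math


def lip_vertex_error(pred_frames, target_frames):
    """Mean over frames of the maximum per-vertex L2 error.

    Each frame is a list of (x, y, z) lip-vertex coordinates.
    Hypothetical helper for illustration only.
    """
    per_frame_max = []
    for pred, target in zip(pred_frames, target_frames):
        # Euclidean distance between each predicted vertex and its ground truth.
        errors = [math.dist(p, t) for p, t in zip(pred, target)]
        # Only the worst vertex in the frame counts.
        per_frame_max.append(max(errors))
    return sum(per_frame_max) / len(per_frame_max)


pred = [[(0, 0, 0), (1, 0, 0)], [(0, 0, 0), (0, 0, 0)]]
target = [[(0, 0, 0), (1, 0, 2)], [(0, 3, 4), (0, 0, 0)]]
print(lip_vertex_error(pred, target))
```

Taking the per-frame maximum means a single badly placed vertex dominates that frame's score, which is the intent: one visibly wrong lip corner is enough to break the illusion of sync.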

[1.8.0] - 2025-07-23

Added

Added VMAF metric to new video domain (#2991)
Added CRPS in regression domain (#3024)
Added aggregation_level argument to DiceScore (#3018)
Added support for reduction="none" to LearnedPerceptualImagePatchSimilarity (#3053)
Added support for single str input in the functional interface of bert_score (#3056)
Enhanced BERTScore to evaluate hypotheses against multiple references (#3069)
Added Lip Vertex Error (LVE) in multimodal domain (#3090)
Added antialias argument to FID metric (#3177)
Added mixed input format to segmentation metrics (#3176)

Changed

Changed data_range argument in PSNR metric to be a required argument (#3178)
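A short sketch of why `data_range` matters: PSNR is defined relative to the maximum possible signal range, so the same pixel errors give very different scores depending on whether values are assumed to lie in [0, 1] or [0, 255]. The function below is a plain-Python illustration of the textbook formula, not the library's code.

```python
import math


def psnr(preds, target, data_range):
    """PSNR in dB: 10 * log10(data_range^2 / MSE). Illustrative helper."""
    mse = sum((p - t) ** 2 for p, t in zip(preds, target)) / len(preds)
    if mse == 0:
        # Identical inputs: the ratio is unbounded.
        return math.inf
    return 10.0 * math.log10(data_range ** 2 / mse)


preds, target = [0.5, 0.5], [0.4, 0.6]
# Same errors, different assumed ranges, very different scores.
print(psnr(preds, target, data_range=1.0))
print(psnr(preds, target, data_range=255.0))
```

The two results differ by a constant offset of 20 * log10(255) dB, which is why a range inferred from the data can silently make scores incomparable across batches.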

Removed

Removed zero_division argument from DiceScore (#3018)

Key Contributors

@nkaenzig, @rittik9, @simonreise, @SkafteNicki

New Contributors

@lantiga made their first contribution in #3054
@AlexVerine made their first contribution in #3057
@ZhiyuanChen made their first contribution in #3059
@ahmedhshahin made their first contribution in #3101
@gratus907 made their first contribution in #3103
@cyyever made their first contribution in #3118
@Armannas made their first contribution in #3124
@alifa98 made their first contribution in #3128
@simonreise made their first contribution in #3176

If we forgot someone due to not matching commit email with GitHub account, let us know :]

Full Changelog: v1.7.0...v1.8.0

Contributors

lantiga, Armannas, and 10 other contributors


05 Jul 12:22

Borda

v1.7.4

986eb90

Minor patch release

[1.7.4] - 2025-07-04

Changed

Improved numerical stability of Pearson's correlation coefficient (#3152)

Fixed

Fixed: Ignore zero and negative predictions in retrieval metrics (#3160)
Fixed SSIM dist_reduce_fx when reduction=None for distributed training (#3162, #3166)
Fixed attribute error (#3154)
Fixed incorrect shape in _pearson_corrcoef_update (#3168)

Key Contributors

@AymenKallala, @gratus907, @Isalia20, @rittik9

If we forgot someone due to not matching commit email with GitHub account, let us know :]

Full Changelog: v1.7.3...v1.7.4

Contributors

gratus907, Isalia20, and 2 other contributors


13 Jun 15:33

Borda

v1.7.3

6e1b354

Minor patch release

[1.7.3] - 2025-06-13

Fixed

Fixed: ensure WrapperMetric resets wrapped_metric state (#3123)
Fixed top_k in multiclass_accuracy (#3117)
Fixed compatibility to COCO format for pycocotools 2.0.10 (#3131)

Key Contributors

@rittik9

If we forgot someone due to not matching commit email with GitHub account, let us know :]

Full Changelog: v1.7.2...v1.7.3

Contributors

rittik9


28 May 20:20

Borda

v1.7.2

6587b6a

Minor patch release

[1.7.2] - 2025-05-27

Changed

Enhance: improve performance of _rank_data (#3103)

Fixed

Fixed UnboundLocalError in MatthewsCorrCoef (#3059)
Fixed MIFID incorrectly converts inputs to byte dtype with custom encoders (#3064)
Fixed ignore_index in MultilabelExactMatch (#3085)
Fixed: disable non-blocking on MPS (#3101)

Key Contributors

@ahmedhshahin, @gratus907, @rittik9, @ZhiyuanChen

If we forgot someone due to not matching commit email with GitHub account, let us know :]

Full Changelog: v1.7.1...v1.7.2

Contributors

gratus907, ZhiyuanChen, and 2 other contributors


07 Apr 19:33

Borda

v1.7.1

41aaba3

Minor patch release

[1.7.1] - 2025-04-06

Changed

Enhanced support for adding a MetricCollection to another MetricCollection in the add_metrics function (#3032)

Fixed

Fixed absent class MeanIOU (#2892)
Fixed detection IoU ignores predictions without ground truth (#3025)
Fixed error raised in MulticlassAccuracy when top_k>1 (#3039)

Key Contributors

@Isalia20, @rittik9, @SkafteNicki

If we forgot someone due to not matching commit email with GitHub account, let us know :]

Full Changelog: v1.7.0...v1.7.1

Contributors

SkafteNicki, Isalia20, and rittik9


20 Mar 19:05

Borda

v1.7.0

3fe3aa5

More image metrics

This release of TorchMetrics adds features and enhancements across multiple domains. In the image domain, the main additions are the ARNIQA and DeepImageStructureAndTextureSimilarity metrics for assessing image quality and similarity. The CLIPScore metric also now supports more models and processors, expanding its use in image-text alignment tasks.

Beyond image analysis, the regression package welcomes the JensenShannonDivergence metric, offering a powerful tool for comparing probability distributions. The clustering package also sees a notable update with the introduction of the ClusterAccuracy metric, which helps evaluate the performance of clustering algorithms more effectively.
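For readers unfamiliar with the Jensen-Shannon divergence, the following sketch shows the standard definition for two discrete distributions: the average KL divergence of each distribution from their midpoint. It is a plain-Python illustration with hypothetical function names and says nothing about how the library handles batching or reduction.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) for discrete distributions; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon_divergence(p, q):
    """Symmetric divergence built from KL against the midpoint distribution."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Disjoint distributions reach the upper bound ln(2) (natural log).
print(jensen_shannon_divergence([1.0, 0.0], [0.0, 1.0]))
# Identical distributions give zero.
print(jensen_shannon_divergence([0.3, 0.7], [0.3, 0.7]))
```

Unlike KL divergence, this quantity is symmetric and always finite, because the midpoint distribution is nonzero wherever either input is.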

In classification, the Equal Error Rate (EER) metric has been added. EER is the error rate at the decision threshold where the false positive rate equals the false negative rate, which gives a single threshold-independent summary that is widely used in verification tasks such as biometrics. Furthermore, the MeanAveragePrecision metric now includes a functional interface.
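A minimal sketch of how an equal error rate can be estimated from scores and binary labels: sweep the observed scores as thresholds and take the point where the false positive and false negative rates are closest. This is an illustrative brute-force version with a hypothetical function name; it assumes both classes are present and does not interpolate between thresholds as a production implementation might.

```python
def equal_error_rate(scores, labels):
    """Approximate EER by scanning every observed score as a threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    best_gap, eer = float("inf"), None
    for threshold in sorted(set(scores)):
        # Negatives scored at or above the threshold are false positives.
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        # Positives scored below the threshold are false negatives.
        fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
        fpr, fnr = fp / negatives, fn / positives
        gap = abs(fpr - fnr)
        if gap < best_gap:
            # Report the midpoint of the two rates at the closest crossing.
            best_gap, eer = gap, (fpr + fnr) / 2
    return eer


print(equal_error_rate([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))
```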

These updates collectively enhance the capabilities of TorchMetrics, making it an even more comprehensive and indispensable resource for machine learning practitioners and researchers.

[1.7.0] - 2025-03-20

Added

Additions to image domain:
- Added ARNIQA metric (#2953)
- Added DeepImageStructureAndTextureSimilarity (#2993)
- Added support for more models and processors in CLIPScore (#2978)
Added JensenShannonDivergence metric to regression package (#2992)
Added ClusterAccuracy metric to cluster package (#2777)
Added Equal Error Rate (EER) to classification package (#3013)
Added functional interface to MeanAveragePrecision metric (#3011)

Changed

Making num_classes optional for one-hot inputs in MeanIoU (#3012)

Removed

Removed Dice from classification (#3017)

Fixed

Fixed edge case in integration between class-wise wrapper and metric tracker (#3008)
Fixed IndexError in MulticlassAccuracy when using top_k with single sample (#3021)

Key Contributors

@Isalia20, @LorenzoAgnolucci, @nathanpainchaud, @rittik9, @SkafteNicki

If we forgot someone due to not matching commit email with GitHub account, let us know :]

Full Changelog: v1.6.0...v1.7.0

Contributors

nathanpainchaud, SkafteNicki, and 3 other contributors


14 Mar 06:57

Borda

v1.6.3

3c444af

Minor patch release

[1.6.3] - 2025-03-13

Fixed

Fixed logic in how metric states referencing is handled in MetricCollection (#2990)
Fixed integration between class-wise wrapper and metric tracker (#3004)

Key Contributors

@SkafteNicki

If we forgot someone due to not matching commit email with GitHub account, let us know :]

Full Changelog: v1.6.2...v1.6.3

Contributors

SkafteNicki


03 Mar 11:25

Borda

v1.6.2

2a0a7fa

Minor patch release

[1.6.2] - 2025-02-28

Added

Added zero_division argument to DiceScore in segmentation package (#2860)
Added cache_session to DNSMOS metric to control caching behavior (#2974)
Added disable option to nan_strategy in basic aggregation metrics (#2943)

Changed

Make num_classes optional for classification in case of micro averaging (#2841)
Enhanced CLIPScore to calculate similarities between inputs of the same modality (#2875)

Fixed

Fixed DiceScore when there is zero overlap between predictions and targets (#2860)
Fixed MeanAveragePrecision for average="micro" when 0 label is not present (#2968)
Fixed corner-case in PearsonCorrCoef when input is constant (#2975)
Fixed MetricCollection.update gives identical results (#2944)
Fixed missing kwargs in PIT metric for permutation wise mode (#2977)
Fixed multiple errors in the _final_aggregation function for PearsonCorrCoef (#2980)
Fixed incorrect CLIP-IQA type hints (#2952)

Key Contributors

@baskrahmer, @czmrand, @rbedyakin, @rittik9, @SkafteNicki, @wooseopkim

If we forgot someone due to not matching commit email with GitHub account, let us know :]

Full Changelog: v1.6.1...v1.6.2

Contributors

rbedyakin, wooseopkim, and 4 other contributors


25 Dec 23:50

Borda

v1.6.1

07224d3

Minor patch release

[1.6.1] - 2024-12-25

Changed

Enabled specifying weights path for FID (#2867)
Removed Device2Host transfers caused by communication between device and host (#2840)

Fixed

Fixed plotting of multilabel confusion matrix (#2858)
Fixed issue with shared state in metric collection when using dice score (#2848)
Fixed top_k for MulticlassF1Score with one-hot encoding (#2839)
Fixed slow calculations of classification metrics with MPS (#2876)

Key Contributors

@Isalia20, @nkaenzig, @podgorki, @rittik9, @yuvalkirstain, @zhaozheng09

If we forgot someone due to not matching commit email with GitHub account, let us know :]

Full Changelog: v1.6.0...v1.6.1

Contributors

zhaozheng09, nkaenzig, and 4 other contributors


12 Nov 19:29

Borda

v1.6.0

58147e0

More metrics

The latest release of TorchMetrics introduces several significant enhancements and new features that will greatly benefit users across various domains. This update includes the addition of new metrics and methods that enhance the library's functionality and usability.

One of the key additions is the NISQA audio metric for evaluating audio quality. In the classification domain, the new LogAUC and NegativePredictiveValue metrics offer additional tools for assessing model performance, particularly on imbalanced datasets. For regression tasks, the NormalizedRootMeanSquaredError metric has been introduced; it divides the RMSE by a normalization factor derived from the targets, so errors can be compared across targets measured on different scales.
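To illustrate what the normalization buys, here is a small sketch of a normalized RMSE using either the target range or the target mean as the denominator. The function name and the two normalization choices shown are assumptions for illustration, not a description of the options TorchMetrics exposes.

```python
import math


def nrmse(preds, target, normalization="range"):
    """RMSE divided by a scale factor computed from the targets (illustrative)."""
    n = len(target)
    rmse = math.sqrt(sum((p - t) ** 2 for p, t in zip(preds, target)) / n)
    if normalization == "range":
        denom = max(target) - min(target)
    elif normalization == "mean":
        denom = sum(target) / n
    else:
        raise ValueError(f"unknown normalization: {normalization}")
    return rmse / denom


preds, target = [1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 6.0]
print(nrmse(preds, target))
# Rescaling both inputs leaves the normalized score unchanged.
print(nrmse([10 * p for p in preds], [10 * t for t in target]))
```

Note that the squared error still weights large deviations heavily, so the normalization makes scores comparable across scales but does not make them robust to outliers.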

In the field of image segmentation, the new Dice metric enhances the evaluation of segmentation models by providing a robust measure of overlap between predicted and ground truth masks. Additionally, the merge_state method has been added to the Metric class, allowing for more efficient state management and aggregation across multiple devices or processes.
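The idea behind merging metric states can be sketched with a toy accumulator: if a metric keeps sufficient statistics (here a running sum and count), two independently updated instances can be combined without revisiting the data. `RunningMean` below is a hypothetical stand-in written in plain Python; it is not the TorchMetrics `Metric` class and does not reproduce its `merge_state` signature.

```python
class RunningMean:
    """Toy metric that stores a sum and a count as its state."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def update(self, values):
        # Accumulate sufficient statistics instead of storing raw values.
        self.total += sum(values)
        self.count += len(values)

    def merge_state(self, other):
        # Combining two states is just adding their statistics.
        self.total += other.total
        self.count += other.count

    def compute(self):
        return self.total / self.count


worker_a, worker_b = RunningMean(), RunningMean()
worker_a.update([1.0, 2.0, 3.0])
worker_b.update([5.0])
worker_a.merge_state(worker_b)
print(worker_a.compute())
```

Merging states rather than averaging the two computed means keeps the result correct when the workers saw different numbers of samples.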

Furthermore, this release includes support for propagating the autograd graph in Distributed Data-Parallel (DDP) settings, so that gradients can flow through metric states that are synchronized across processes. Together, these enhancements make TorchMetrics a more versatile tool for model evaluation across a wide range of applications.

[1.6.0] - 2024-11-12

Added

Added audio metric NISQA (#2792)
Added classification metric LogAUC (#2377)
Added classification metric NegativePredictiveValue (#2433)
Added regression metric NormalizedRootMeanSquaredError (#2442)
Added segmentation metric Dice (#2725)
Added method merge_state to Metric (#2786)
Added support for propagation of the autograd graph in DDP setting (#2754)

Changed

Changed naming and input order arguments in KLDivergence (#2800)

Deprecated

Deprecated Dice from classification metrics (#2725)

Removed

Changed minimum supported PyTorch version to 2.0 (#2671)
Dropped support for Python 3.8 (#2827)
Removed num_outputs in R2Score (#2800)

Fixed

Fixed segmentation Dice + GeneralizedDice for 2d index tensors (#2832)
Fixed mixed results of rouge_score with accumulate='best' (#2830)

Key Contributors

@Borda, @cw-tan, @philgzl, @rittik9, @SkafteNicki

New Contributors since `1.5.0`

@bfolie made their first contribution in #2793
@StalkerShurik made their first contribution in #2811
@philgzl made their first contribution in #2792
@cw-tan made their first contribution in #2754

Full Changelog: v1.5.0...v1.6.0

Contributors

bfolie, Borda, and 5 other contributors

