get_coco_api_from_dataset returns wrong dataset #2619

## 🐛 Bug

I tried to train a detector on a custom dataset (following https://pytorch.org/tutorials/intermediate/torchvision_tutorial.html), and training got stuck at 60% precision. Looking at the source code, I noticed that `get_coco_api_from_dataset()` is probably doing the wrong thing. When a `Subset` is used for the train and test datasets, the function falls back to the `Subset`'s original dataset. The problem is that this is the entire dataset, not just the test dataset, which is only a part of it.

This has two consequences. First, evaluation slows down because a needless COCO dataset is built from the full data. Second, `coco_evaluator` is updated in `evaluate()` with wrong stats because of conflicting `"image_id"` values: according to the tutorial, `"image_id"` is unique only within one dataset, which is why the dataset index is used for it. As a result, the evaluation stats presented during training are invalid.
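The unwrapping described above can be illustrated with a minimal sketch. `ToyDataset`, `ToySubset`, and `unwrap` are hypothetical stand-ins written for this example, not torchvision code: `ToySubset` only mirrors the `dataset` and `indices` attributes of `torch.utils.data.Subset`, and `unwrap` is a simplified guess at the behavior reported here.

```python
class ToyDataset:
    """Hypothetical stand-in for a custom detection dataset (image_id = index)."""

    def __init__(self, n):
        self.n = n

    def __len__(self):
        return self.n

    def __getitem__(self, idx):
        if not 0 <= idx < self.n:
            raise IndexError(idx)
        return f"image_{idx}", {"image_id": idx}


class ToySubset:
    """Stand-in mirroring torch.utils.data.Subset: a dataset plus selected indices."""

    def __init__(self, dataset, indices):
        self.dataset = dataset
        self.indices = indices

    def __len__(self):
        return len(self.indices)

    def __getitem__(self, i):
        return self.dataset[self.indices[i]]


def unwrap(dataset):
    """Simplified version of the reported behavior: strip Subset wrappers."""
    while isinstance(dataset, ToySubset):
        dataset = dataset.dataset
    return dataset


full = ToyDataset(100)
test_split = ToySubset(full, list(range(80, 100)))  # last 20 images held out for testing
evaluated = unwrap(test_split)

# The test split has 20 images, but the unwrapped object covers all 100.
print(len(test_split), len(evaluated))
```

In this sketch the object handed to the evaluation step is the full dataset, so any ground truth built from it covers the training images as well.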

## To Reproduce

No reproduction steps are needed; the problem is visible in the source of `evaluate()`:

```python
@torch.no_grad()
def evaluate(model, data_loader, device):
    ...
    # Problem: this returns the full dataset, not the Subset
    coco = get_coco_api_from_dataset(data_loader.dataset)
    ...
    res = {target["image_id"].item(): output for target, output in zip(targets, outputs)}
    evaluator_time = time.time()
    # Problem: an invalid target is updated
    coco_evaluator.update(res)
    ...
    return coco_evaluator
```


## Expected behavior

`get_coco_api_from_dataset()` should use the `Subset` itself, so that the COCO dataset is built only from the images in the test split.
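A rough sketch of the expected behavior, under stated assumptions: `ToySubset` is a hypothetical stand-in for `torch.utils.data.Subset`, and `ground_truth_image_ids` is an illustrative helper, not part of torchvision. It collects ground-truth ids by iterating the dataset exactly as it was passed in, without unwrapping.

```python
class ToySubset:
    """Stand-in mirroring torch.utils.data.Subset: a dataset plus selected indices."""

    def __init__(self, dataset, indices):
        self.dataset = dataset
        self.indices = indices

    def __len__(self):
        return len(self.indices)

    def __getitem__(self, i):
        return self.dataset[self.indices[i]]


def ground_truth_image_ids(dataset):
    """Collect image ids by iterating the dataset as given (no unwrapping)."""
    return [dataset[i][1]["image_id"] for i in range(len(dataset))]


# Toy full dataset: (image, target) pairs whose image_id is the dataset index.
full = [(f"image_{i}", {"image_id": i}) for i in range(10)]
test_split = ToySubset(full, [7, 8, 9])

subset_ids = ground_truth_image_ids(test_split)
full_ids = ground_truth_image_ids(full)
print(subset_ids)     # ids of the test images only
print(len(full_ids))  # the full dataset is never touched unless passed explicitly
```

Iterating the `Subset` directly keeps the ground truth limited to the test images, which is the behavior this issue asks for.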

## Environment

Collecting environment information...
PyTorch version: 1.6.0
Torchvision: 0.7.0
Is debug build: N/A
CUDA used to build PyTorch: N/A

OS: Ubuntu 20.04 LTS (x86_64)
GCC version: (Ubuntu 9.3.0-10ubuntu2) 9.3.0
Clang version: Not used
CMake version: version 3.16.3

Python version: 3.8 (64-bit runtime)
Is CUDA available: yes
CUDA runtime version: 11.0.2-1
GPU models and configuration: GPU 0: GeForce GTX 1080 Ti
Nvidia driver version: 450.51.05
cuDNN version: /usr/lib/x86_64-linux-gnu/libcudnn.so.7.4.2

Versions of relevant libraries:
[pip3] numpy==1.17.4
[conda] Not used

