Pytorch re-implementation of Paper: SwinTextSpotter v2: Towards Better Synergy for Scene Text Spotting (IJCV 2025)

mxin262/SwinTextSpotterv2

SwinTextSpotter

This is the PyTorch implementation of the paper "SwinTextSpotter v2: Towards Better Synergy for Scene Text Spotting". The paper is available at this link.

Installation

  • Python=3.8
  • PyTorch=1.8.0, torchvision=0.9.0, cudatoolkit=11.1
  • OpenCV for visualization
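
To confirm that an environment meets these requirements, a quick check along the following lines may help. This is a minimal sketch, not part of the repository: `missing_modules` and `REQUIRED_MODULES` are hypothetical names, and the mapping from package name to import name (for example `opencv-python` to `cv2`, `Polygon3` to `Polygon`) is an assumption. The list also covers the pip packages installed in the steps below.

```python
import importlib.util

# Import names for the packages this README installs. The mapping from
# pip/conda package name to import name is an assumption of this sketch
# (e.g. opencv-python -> cv2, Polygon3 -> Polygon).
REQUIRED_MODULES = [
    "torch", "torchvision", "cv2", "scipy",
    "shapely", "rapidfuzz", "timm", "Polygon",
]


def missing_modules(names):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

The check only looks for importable modules; it does not verify the pinned versions or whether CUDA is usable.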

Steps

  1. Install the repository (we recommend using Anaconda for the installation).
conda create -n SWINTSv2 python=3.8 -y
conda activate SWINTSv2
conda install pytorch==1.8.0 torchvision==0.9.0 torchaudio==0.8.0 cudatoolkit=11.1 -c pytorch -c conda-forge
pip install opencv-python
pip install scipy
pip install shapely
pip install rapidfuzz
pip install timm
pip install Polygon3
git clone https://github.com/mxin262/SwinTextSpotterv2.git
cd SwinTextSpotterv2
python setup.py build develop
  2. Arrange the datasets in the following directory layout:
datasets
|_ totaltext
|  |_ train_images
|  |_ test_images
|  |_ totaltext_train.json
|  |_ weak_voc_new.txt
|  |_ weak_voc_pair_list.txt
|_ mlt2017
|  |_ train_images
|  |_ annotations/icdar_2017_mlt.json
.......
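
Before training, it can be useful to verify that the layout above is in place. The following is a minimal sketch under stated assumptions: `missing_entries` and `EXPECTED_ENTRIES` are hypothetical helpers, not part of the repository, and only the entries explicitly listed above are checked (the elided part of the listing is left out).

```python
from pathlib import Path

# Entries taken from the layout above; the elided datasets ("......." in the
# listing) are intentionally not filled in.
EXPECTED_ENTRIES = [
    "totaltext/train_images",
    "totaltext/test_images",
    "totaltext/totaltext_train.json",
    "totaltext/weak_voc_new.txt",
    "totaltext/weak_voc_pair_list.txt",
    "mlt2017/train_images",
    "mlt2017/annotations/icdar_2017_mlt.json",
]


def missing_entries(root, expected=EXPECTED_ENTRIES):
    """Return the expected paths (relative to root) that do not exist."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).exists()]


if __name__ == "__main__":
    # Assumes the script is run from the repository root.
    for rel in missing_entries("datasets"):
        print("missing:", rel)
```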

Download the images.

Download the labels: [Google Drive] [BaiduYun] (PW: 46vd)

Download the lexicon [Google Drive] and place it in the corresponding dataset directory.

You can also prepare a custom dataset by following the example scripts. [example scripts]

Totaltext

To evaluate on Total Text, CTW1500, or ICDAR2015, first download the zipped annotations and unzip them.

  1. Pretrain SWINTSv2 (e.g., with Swin-Transformer backbone)
python projects/SWINTSv2/train_net.py \
  --num-gpus 8 \
  --config-file projects/SWINTSv2/configs/SWINTS-swin-pretrain.yaml
  2. Fine-tune the model on the mixed real dataset
python projects/SWINTSv2/train_net.py \
  --num-gpus 8 \
  --config-file projects/SWINTSv2/configs/SWINTS-swin-mixtrain.yaml
  3. Fine-tune the model on Total Text
python projects/SWINTSv2/train_net.py \
  --num-gpus 8 \
  --config-file projects/SWINTSv2/configs/SWINTS-swin-finetune-totaltext.yaml
  4. Evaluate SWINTSv2 (e.g., with Swin-Transformer backbone)
python projects/SWINTSv2/train_net.py \
  --config-file projects/SWINTSv2/configs/SWINTS-swin-finetune-totaltext.yaml \
  --eval-only MODEL.WEIGHTS ./output/model_final.pth
  5. Visualize the detection and recognition results (e.g., with Swin-Transformer backbone)
python demo/demo.py \
  --config-file projects/SWINTSv2/configs/SWINTS-swin-finetune-totaltext.yaml \
  --input input1.jpg \
  --output ./output \
  --confidence-threshold 0.4 \
  --opts MODEL.WEIGHTS ./output/model_final.pth
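
The three training stages above can also be chained from a small driver script. The sketch below is an illustration only: `train_command` and `run_stages` are hypothetical helpers, not part of the repository, and it assumes the script is run from the repository root. How each stage picks up the weights of the previous one is not stated here and is left to the config files.

```python
import subprocess
import sys

TRAIN_SCRIPT = "projects/SWINTSv2/train_net.py"

# Configs in the order the steps above use them.
STAGE_CONFIGS = [
    "projects/SWINTSv2/configs/SWINTS-swin-pretrain.yaml",
    "projects/SWINTSv2/configs/SWINTS-swin-mixtrain.yaml",
    "projects/SWINTSv2/configs/SWINTS-swin-finetune-totaltext.yaml",
]


def train_command(config, num_gpus=8):
    """Build the argument list for one training stage."""
    return [sys.executable, TRAIN_SCRIPT,
            "--num-gpus", str(num_gpus),
            "--config-file", config]


def run_stages(configs=STAGE_CONFIGS, num_gpus=8, dry_run=True):
    """Print each stage's command in order; execute it unless dry_run is set."""
    commands = [train_command(c, num_gpus) for c in configs]
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # check=True raises CalledProcessError on a non-zero exit,
            # so a failed stage stops the pipeline.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    run_stages(dry_run=True)
```

With `dry_run=True` the script only prints the commands, which is a cheap way to review them before committing GPU time.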

Copyright

For commercial use, please contact Prof. Yuliang Liu: [email protected] and Prof. Lianwen Jin: [email protected]
