This repository provides the implementation code for the text classification methods applied in the analysis of SNSF grant peer review texts conducted in the following research paper:
A Supervised Machine Learning Approach for Assessing Grant Peer Review Reports
available on arXiv as a preprint.
The paper develops a pipeline to analyze the texts of grant peer review reports using Natural Language Processing (NLP) and machine learning (ML). It defines 12 categories reflecting content of grant peer review reports which are subsequently labelled by multiple human annotators in a novel text corpus of grant peer review reports submitted to the Swiss National Science Foundation (SNSF). The annotated texts are used to fine-tune pre-trained transformer models to classify these categories at scale. The results show that many categories can be reliably identified by human annotators and machine learning approaches. However, the choice of text classification approach considerably influences the performance. The fine-tuned models are publicly shared to enable others to analyse the contents of grant peer review reports in a structured manner.
The final models fine-tuned for text classification are based on the SPECTER2 base model and are available at Hugging Face Hub. Please note that due to data protection laws, the training data cannot be shared.
Authors: Gabriel Okasa, Alberto de León, Michaela Strinzel, Anne Jorstad, Katrin Milzow, Matthias Egger, and Stefan Müller
The code scripts are located in the code
subfolder, which consists of four
subfolders that contain the implementation for the following four types of
text classification approaches:
- Binary Classification
- Multi-task Classification
- Multi-label Classification
- Few-shot Classification
of which the binary classification approach achieves the best classification accuracy as documented in the research paper.
The subfolder data
serves as a placeholder as due to data protection laws, the
input text data cannot be shared.
Similarly, the subfolder output
is a placeholder for storage of the model outputs
with corresponding subfolder for each of the four classification approaches.
The subfolder notebooks
includes tutorials on using the publicly released
models from Hugging Face to classify texts
from grant peer review reports.
The fine-tuning and prompting of the models was performed locally without access to the internet to prevent any potential data leakage or network interference.
The codes for the classification analyses use pre-trained models which are fine-tuned or prompted for classification of the set of 12 categories that reflect the content of grant peer review reports. The respective code scripts are detailed below.
-
binary_classification.py
: Script for fine-tuning pre-trained model for binary classification for each category. It performs the fine-tuning via Hugging Facetrainer
pipeline - the base model is fine-tuned separately for each category, resulting in 12 fine-tuned models. This is the primary script for the main results of the paper. -
ablation_study.py
: Script for approximation of data value via an ablation study. It fine-tunes the base model multiple times, while sequentially increasing the training set and testing the classification accuracy on a fixed set of test sentences.
multitask_classification.py
: Script for fine-tuning pre-trained model for binary classification via multi-task learning. It performs the fine-tuning viatasknet
andtrainer
, i.e. one single encoder model with different classification heads for each category via adapters.
multilabel_classification.py
: Script for fine-tuning pre-trained model for multi-label classification. It performs the fine-tuning via Hugging Facetrainer
pipeline - one single model predicting the probabilities across all categories at the same time.
-
fewshot_classification.py
: Script for prompting Llama model for binary classification of each category. It performs the few-shot learning via Hugging Facetext-generation
pipeline - one single model prompted separately for each category to predict the labels. The script deploys theMeta-Llama-3-8B-Instruct
model for inference locally. -
llama_prompts.py
: Script to define prompts for Meta's Llama model for all 12 categories. The prompts are based on the Hugging Face chat template.
Each of the classification subfolders also contains a utils.py
script which
includes a collection of utility functions.
To clone the repository run:
git clone https://github.com/snsf-data/ml-peer-review-analysis.git
The required Python
modules can be installed by navigating to the root of the
cloned project and executing the following command: pip install -r requirements.txt
.
The implementation relies on Python
version 3.12.4.
The code snippet below demonstrates a minimal example for deployment of the fine-tuned model for classifying if methods are mentioned in the grant peer review texts using the transformer's text classification pipeline.
# import transformers library
import transformers
# load tokenizer from specter2_base - the base model
tokenizer = transformers.AutoTokenizer.from_pretrained("allenai/specter2_base")
# load the SNSF fine-tuned model for classification of methods in review texts
model = transformers.AutoModelForSequenceClassification.from_pretrained("snsf-data/specter2-review-method")
# setup the classification pipeline
classification_pipeline = transformers.TextClassificationPipeline(
model=model,
tokenizer=tokenizer,
return_all_scores=True
)
# prediction for an example review sentence mentioning methods
classification_pipeline("The applicant is using statistical and analytic approaches that are appropriate.")
# prediction for an example review sentence not mentioning methods
classification_pipeline("The project deals with an undoubtedly very interesting subject.")
In addition, the subfolder notebooks
contains a detailed tutorial on deploying the
fine-tuned models from Hugging Face at scale,
including data pre-processing steps.
- arXiv preprint
- data management plan
- annotation codebook
- Hugging Face models
- archived code and models
If you have questions regarding the Python
codes, please contact
[email protected]. For general inquiries about
the research paper, please contact [email protected].
MIT © snsf-data