DNA-RNA_Project

Overview

This repository contains the final project of Group 5 for the DNA/RNA Dynamics course (MSc in Bioinformatics, University of Bologna, a.y. 2024/2025). It features a complete pipeline for analyzing Illumina 450K methylation data using R. The pipeline includes preprocessing (with the preprocessNoob method), quality control, normalization, statistical analysis (using Mann-Whitney U test and P-value threshold of 0.05), principal component analysis (PCA), and the identification of differentially methylated positions (DMPs) between control (CTRL) and disease (DIS) samples.

Requirements

The project was performed using R and Rstudio, analysing data from the platform Illumina HumanMethylation450k (input_data). Necessary packages were installed in our R enviroment:

install.packages(minfi)
install.packages(ggplot2)
install.packages(nitr)
install.packages(BiocManager)
install.packages(factoextra)
install.packages(cluster)
install.packages(qqman)
install.packages(gplots)

Project Workflow

This document outlines the step-by-step workflow of our DNA/RNA methylation project, comparing CTRL and DIS sample groups using Illumina microarray data.

1. Data Preparation and Import

Install and load required R packages: minfi, BiocManager, knitr.
Clean the R environment and set the working directory.
Load the sample sheet using read.metharray.sheet() to import metadata.
Read raw data using read.metharray.exp() to generate the RGset object.

2. Signal Extraction

Extract fluorescence intensity data for Red (Cy5) and Green (Cy3) channels from RGset using getRed() and getGreen().
Store the data into two separate dataframes: Red and Green.

3. Quality Control

Classify sample quality based on the percentage of failed probes:
- High quality: < 0.01%
- Good quality: < 0.2%
- Low quality: > 0.2%
- Critical quality: around 1% (may require exclusion)

4. Beta and M Value Calculation

Split samples into CTRL and DIS groups using metadata.
Create MSet.raw objects for each group.
Compute Beta values with getBeta() and M values with getM().
Calculate and plot mean methylation values for both groups.
Normalise data and asses the quality of normalisation.

5. Normalization and Batch Effect Analysis

Perform Principal Component Analysis (PCA) to assess batch effects (e.g., by Sentrix_ID).
PCA suggested that normalization did not fully correct batch effects.

6. Statistical Analysis

Perform Mann whitney U test for each probe comparing CTRL vs DIS.
Create a dataframe containing p-values.
Filter probes with p ≤ 0.05 for significance.
Plot the distribution of p-values.

7. Visualization

Generate a heatmap using the top 100 most significant probes.
Apply hierarchical clustering.

Contacts

Project members:

-Marco Cuscunà ([email protected])

-Marco Centenaro ([email protected])

-Michele Carbonieri ([email protected])

-Marina Mariano ([email protected])

-Luca Cagnini ([email protected])

-Massimo Lanari ([email protected])

Name		Name	Last commit message	Last commit date
Latest commit History 71 Commits
Plots		Plots
R_scripts		R_scripts
input_data		input_data
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

DNA-RNA_Project

Table of contents

Overview

Requirements

Project Workflow

1. Data Preparation and Import

2. Signal Extraction

3. Quality Control

4. Beta and M Value Calculation

5. Normalization and Batch Effect Analysis

6. Statistical Analysis

7. Visualization

Contacts

About

Uh oh!

Releases

Packages

Contributors 2

Uh oh!

Languages

LucaCagnini/DNA-RNA_Project

Folders and files

Latest commit

History

Repository files navigation

DNA-RNA_Project

Table of contents

Overview

Requirements

Project Workflow

1. Data Preparation and Import

2. Signal Extraction

3. Quality Control

4. Beta and M Value Calculation

5. Normalization and Batch Effect Analysis

6. Statistical Analysis

7. Visualization

Contacts

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Uh oh!

Languages

Packages