Genome annotation workflow with Helixer

This workflow annotates a genome using Helixer and evaluates the quality of the annotation with BUSCO and Genome Annotation Statistics. GFFRead is used to extract predicted protein sequences, and both BUSCO and OMArk assess proteome quality. The final annotation can be visualized interactively using JBrowse.

Helixer takes a different approach to annotation: it makes evidence-free predictions (no RNA-Seq data or sequence alignments are required) and runs on a Graphics Processing Unit (GPU), which greatly shortens execution time. The annotation relies on a cross-species deep learning model. The software can also be used to configure and train models for ab initio prediction of gene structure. In other words, it identifies which base pairs in a genome belong to the UTRs, coding sequences (CDS), or introns of genes.

Workflow steps

Genome annotation with Helixer
Extraction of predicted proteins from annotation with GFFRead
Evaluation of annotation
- Genome Annotation Statistics
- BUSCO (on genome and predicted proteins)
Proteome quality assessment with OMArk
Visualization of annotation with JBrowse
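
The data flow between these steps can be sketched as a small dependency graph. The edges below are an assumption inferred from the step list and the input description in this README; they are not read from the `.ga` workflow file, and the node names are illustrative labels only.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each key depends on the items in its set.
# Assumption: edges are inferred from this README, not from the .ga file.
DEPENDENCIES = {
    "Helixer": {"genome FASTA"},
    "GFFRead": {"genome FASTA", "Helixer"},
    "Genome Annotation Statistics": {"genome FASTA", "Helixer"},
    "BUSCO (genome)": {"genome FASTA"},
    "BUSCO (proteins)": {"GFFRead"},
    "OMArk": {"GFFRead"},
    "JBrowse": {"genome FASTA", "Helixer"},
}


def execution_order(dependencies):
    """Return one valid order in which the steps could run."""
    return list(TopologicalSorter(dependencies).static_order())


if __name__ == "__main__":
    for step in execution_order(DEPENDENCIES):
        print(step)
```

Any order returned this way places Helixer before GFFRead, and GFFRead before the protein-level BUSCO and OMArk steps. Galaxy schedules the real workflow itself; the sketch only shows which steps wait on which.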

Input data

The following input file is required for the workflow:

Genome sequence (FASTA format): The genome to be annotated. Used by Helixer, Genome Annotation Statistics, BUSCO, GFFRead, and JBrowse.
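
Because every step consumes the same genome file, a quick sanity check before upload can save a failed run. The following is a minimal sketch in plain Python. The function name `check_fasta` and the specific checks (unique identifiers, no empty records) are illustrative assumptions, not requirements stated by the workflow or by Helixer.

```python
def check_fasta(lines):
    """Return (n_sequences, total_length); raise ValueError on malformed input.

    `lines` is any iterable of text lines, for example an open file handle.
    """
    n_sequences = 0
    total_length = 0
    seen_ids = set()
    current_length = None  # None until the first header is seen

    for line_number, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line:
            continue  # tolerate blank lines
        if line.startswith(">"):
            if current_length == 0:
                raise ValueError(f"line {line_number}: previous record has no sequence")
            fields = line[1:].split()
            if not fields:
                raise ValueError(f"line {line_number}: header has no identifier")
            seq_id = fields[0]
            if seq_id in seen_ids:
                raise ValueError(f"line {line_number}: duplicate identifier {seq_id!r}")
            seen_ids.add(seq_id)
            n_sequences += 1
            current_length = 0
        else:
            if current_length is None:
                raise ValueError(f"line {line_number}: sequence data before first header")
            current_length += len(line)
            total_length += len(line)

    if n_sequences == 0:
        raise ValueError("no FASTA records found")
    if current_length == 0:
        raise ValueError("last record has no sequence")
    return n_sequences, total_length
```

For a file on disk, the function could be called as `check_fasta(open("genome.fasta"))`, where `genome.fasta` is a placeholder path.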

Output data

The workflow generates the following outputs:

Annotation file (GFF3): Contains the final consensus gene models produced by Helixer.
BUSCO results: Assess the completeness of the annotation and include:
- A summary of results.
- A table of all searched BUSCO genes with their status.
- A table of missing BUSCO genes.
Annotation statistics: Summary and graphical analyses of the annotation, produced by Genome Annotation Statistics.
Protein sequences (FASTA): Predicted from the annotation using GFFRead.
OMArk report: Assesses the completeness, consistency, and contamination of the predicted proteome.
Genome browser visualization (HTML): An interactive genome view produced by JBrowse.
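
As a quick check on the GFF3 output, independent of the Genome Annotation Statistics step, the feature types in the annotation file can be tallied with a few lines of Python. This is a minimal sketch that relies only on the standard nine-column, tab-separated GFF3 layout. The function name `count_feature_types` is illustrative, and no assumption is made about which feature types Helixer emits.

```python
from collections import Counter


def count_feature_types(gff3_lines):
    """Count occurrences of each feature type (column 3) in GFF3 text lines."""
    counts = Counter()
    for raw in gff3_lines:
        line = raw.rstrip("\n")
        if line.startswith("##FASTA"):
            break  # an embedded sequence section follows; no more features
        if not line or line.startswith("#"):
            continue  # skip blank lines, comments, and directives
        columns = line.split("\t")
        if len(columns) != 9:
            continue  # not a well-formed feature line; ignore it
        counts[columns[2]] += 1
    return counts
```

With a downloaded annotation, `count_feature_types(open("annotation.gff3"))` returns a `Counter` mapping each feature type to its count; `annotation.gff3` is a placeholder path.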
