Open
Description
This is built on https://github.com/remiolsen/NGI-RADseqQC
This is a major rewrite to bring the pipeline in line with nf-core and to update the tools used (move to Stacks 2.0, remove read-joining, etc.). It also covers a few other points that have been on my wishlist for improving usability.
Main tasks
- Think of a cool new name
- Make cookiecutter template
- Port over some of the processes from https://github.com/remiolsen/NGI-RADseqQC
- Use Stacks 2.0
- Write a tool to scrape the Stacks logfiles for useful stats
- Others
Core pipeline tasks
- Remove FLASH
- Make a Dockerfile. Is Stacks 2.0 on bioconda?
- Write a Python script to parse denovo Stacks output to get: coverage, raw # sample loci, catalog loci per sample, "shared" loci histogram. Parse process_radtags also?
- Make a MultiQC configuration to import this data
- Get publicly available data from ENA. Make proper test data.
- Make a MultiQC module for Stacks >= 2.0
Polish
- Make a GH release
- Documentation, documentation, documentation
- Travis-CI
- Python3 support for in silico digest helper script
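
For reference, a minimal Python 3 sketch of what an in silico digest does. This is not the existing helper script; the function name `digest` and its arguments are made up for illustration. It cuts a sequence at every occurrence of a recognition site and only scans the forward strand, which is enough for palindromic sites such as EcoRI (G^AATTC).

```python
import re
from typing import List


def digest(sequence: str, site: str, cut_offset: int) -> List[str]:
    """Split `sequence` at every occurrence of the recognition `site`.

    `cut_offset` is the position inside the site where the enzyme cuts,
    e.g. 1 for EcoRI (G^AATTC). Hypothetical helper, forward strand only.
    """
    seq = sequence.upper()
    site = site.upper()
    # A lookahead match finds overlapping occurrences of the site as well.
    pattern = re.compile(f"(?={re.escape(site)})")
    cuts = [m.start() + cut_offset for m in pattern.finditer(seq)]

    fragments = []
    start = 0
    for cut in cuts:
        fragments.append(seq[start:cut])
        start = cut
    fragments.append(seq[start:])
    return fragments


if __name__ == "__main__":
    for fragment in digest("AAGAATTCTTGAATTCGG", "GAATTC", 1):
        print(len(fragment), fragment)
```

The fragment lengths from a call like this are what a size-selection estimate would be built on.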
Others -- Stretch goals
- Think about what output files stacks should be creating by default.
- Let the user specify which output files to create -- Nah the defaults are probably fine -- Nuh-uh we need more!
  - genepop
  - structure
- Scripts for running the Stacks web UI -- It's been removed in v >= 2.0
- Pick a set of “best practice” parameters for Stacks and run all of these.
- Clearly report the r80 statistic of each run, i.e. the # of polymorphic loci shared by at least 80% of individuals in the population -- http://doi.org/10.1111/2041-210X.12775
- Support running Stacks with a reference genome
- Support for premade population map file
- Support for already processed reads (skipping trimming and `process_reads`)
- Option to not output trimmed and/or processed fastq files
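
A rough sketch of how the r80 count mentioned above could be computed once the polymorphic loci per sample are known. The input structure (a dict from locus ID to the set of samples carrying that locus) and the name `count_r80_loci` are assumptions for illustration; nothing here reads real Stacks output, and a single population is assumed.

```python
from typing import Dict, Set


def count_r80_loci(
    polymorphic_loci: Dict[str, Set[str]],
    n_individuals: int,
    min_percent: int = 80,
) -> int:
    """Count polymorphic loci present in at least `min_percent` % of individuals.

    `polymorphic_loci` maps a locus ID to the set of sample names in which
    that locus was found (hypothetical structure, not a Stacks file format).
    """
    if n_individuals <= 0:
        raise ValueError("n_individuals must be positive")
    # Integer comparison avoids floating point issues at the threshold.
    return sum(
        1
        for samples in polymorphic_loci.values()
        if len(samples) * 100 >= min_percent * n_individuals
    )


if __name__ == "__main__":
    loci = {
        "locus_1": {"s1", "s2", "s3", "s4", "s5"},  # 5/5 samples
        "locus_2": {"s1", "s2", "s3", "s4"},        # 4/5 samples, exactly 80%
        "locus_3": {"s1", "s2", "s3"},              # 3/5 samples, below threshold
    }
    print(count_r80_loci(loci, n_individuals=5))
```

Running the same function over each parameter set would give one r80 value per run, which is the number to report side by side.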