README.md: 14 additions & 8 deletions
@@ -156,7 +156,7 @@ Input to EGAPx is in the form of a YAML file.
### Running EGAPx with short and long RNA-seq reads
-- Optionally, you can also include long reads RNA-seq data from SRA or local files using the same formatting structure for short reads, using the label `long_reads:`
+- Optionally, you can also include long reads RNA-seq data from SRA or local files (FASTA or FASTQ, not BAM) using the same formatting structure for short reads, using the label `long_reads:`
```
genome: path to assembled genome in FASTA format
@@ -171,6 +171,7 @@ Input to EGAPx is in the form of a YAML file.
short_reads: txid43150[Organism] AND 75:350[ReadLength] AND illumina[Platform] AND biomol_rna[Properties]
long_reads: txid43150[Organism] AND (oxford_nanopore[Platform] OR pacbio_smrt[Platform]) AND biomol_rna[Properties]
```
+- We have not rigorously tested EGAPx performance using clustered vs. non-clustered IsoSeq reads. EGAPx uses read depth for filtering and removing rare isoforms with limited support, but clustered reads will reduce compute cost.
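The FASTA/FASTQ restriction on local long-read files can be checked before a run is launched. The following is a minimal sketch, not part of EGAPx: the helper names `is_fasta_or_fastq` and `check_long_read_paths` and the extension lists are assumptions made for illustration, and the check looks at file names only, not file contents.

```python
from pathlib import Path

# Assumed extension lists; EGAPx itself may accept other spellings.
SEQUENCE_EXTS = {".fa", ".fasta", ".fna", ".fq", ".fastq"}
COMPRESSION_EXTS = {".gz"}


def is_fasta_or_fastq(path: str) -> bool:
    """Return True if the file name looks like FASTA/FASTQ (optionally gzipped)."""
    suffixes = [s.lower() for s in Path(path).suffixes]
    # Ignore a trailing compression suffix such as ".gz".
    if suffixes and suffixes[-1] in COMPRESSION_EXTS:
        suffixes = suffixes[:-1]
    return bool(suffixes) and suffixes[-1] in SEQUENCE_EXTS


def check_long_read_paths(paths):
    """Return the paths that do not look like FASTA/FASTQ (for example BAM)."""
    return [p for p in paths if not is_fasta_or_fastq(p)]
```

Running `check_long_read_paths` on the local paths intended for `long_reads:` returns the entries that would need converting first; an empty list means every file name passed the check.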
## Input example
[Back to Top](#Contents)
@@ -310,7 +311,7 @@ If you do not have internet access from your cluster, you can run EGAPx in offli
```
rm egap*sif
singularity cache clean
-singularity pull docker://ncbi/egapx:0.4.0-alpha
+singularity pull docker://ncbi/egapx:0.4.1-alpha
```
- Clone the repo:
@@ -343,7 +344,7 @@ If you do not have internet access from your cluster, you can run EGAPx in offli
- Run `egapx.py` first to edit the `biowulf_cluster.config`:
:warning: The current EGAPx release (0.4.0) will produce submission files, however they are not yet GenBank submission-compliant. A subsequent patch release (likely 0.4.1) will produce GenBank-compliant files. We welcome users to try the process below to produce submission files and create a GitHub issue with errors or questions.
-
After annotating your genome with EGAPx, you can prepare your annotation for submission to NCBI.
### Prepare required files and metadata
@@ -585,7 +584,13 @@ You will need:
- To submit annotation for existing GenBank assemblies, you can access the BioProject information on Datasets Genome pages by searching the assembly accession at https://www.ncbi.nlm.nih.gov/datasets/genome/. locus_tag prefix is not needed in your `prepare_submission` command
- To submit annotation with new assemblies, you will need additional inputs:
-- Source modifiers table file prepared from https://www.ncbi.nlm.nih.gov/WebSub/html/help/genbank-source-table.html
+- Source modifiers table file (see `examples/example_source_table.src`)
+- Chromosome names follow these [rules](https://www.ncbi.nlm.nih.gov/genbank/genomesubmit/#chr_names)
+- Default topology is `linear`, only specify `circular` for organelles
+- Unplaced sequences can be completely omitted from the file
+- Rare cases of unlocalized sequences (not "the" chromosome, but with a chromosome assignment) should be included with the chromosome name in the chromosome column and blank in the location column
+
- Assembly data structured comment file prepared from https://submit.ncbi.nlm.nih.gov/structcomment/genomes/
- linkage evidence argument from options at https://www.ncbi.nlm.nih.gov/genbank/wgs_gapped/, e.g. `proximity-ligation` from Hi-C, `paired-ends` from Illumina
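The topology rule for the source modifiers table (default `linear`, `circular` only for organelles) lends itself to a quick pre-submission check. The sketch below is illustrative only and is not part of `prepare_submission`: it assumes a tab-delimited table, and the column names `topology` and `location` as well as the organelle list are hypothetical, so compare them against `examples/example_source_table.src` before relying on it.

```python
import csv
import io

# Hypothetical column names; check examples/example_source_table.src for the real header.
TOPOLOGY_COL = "topology"
LOCATION_COL = "location"
# Assumed set of organelle locations, for illustration only.
ORGANELLE_LOCATIONS = {"mitochondrion", "chloroplast", "plastid", "apicoplast"}


def find_topology_problems(table_text):
    """Return (line_number, message) pairs for rows whose topology looks wrong."""
    problems = []
    reader = csv.DictReader(io.StringIO(table_text), delimiter="\t")
    for line_number, row in enumerate(reader, start=2):  # header is line 1
        # A blank or missing topology is treated as the default, linear.
        topology = (row.get(TOPOLOGY_COL) or "").strip().lower() or "linear"
        location = (row.get(LOCATION_COL) or "").strip().lower()
        if topology not in {"linear", "circular"}:
            problems.append((line_number, f"unknown topology {topology!r}"))
        elif topology == "circular" and location not in ORGANELLE_LOCATIONS:
            problems.append((line_number, "circular topology on a non-organelle sequence"))
    return problems
```

An empty result means no row declared `circular` outside the assumed organelle locations; it does not validate chromosome names or any other column.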
@@ -610,15 +615,16 @@ You are ready to run `prepare_submission`. See below for full list of required/o