* (feat): Update advanced training to new syntax
Updates to the advanced training to use the current syntax making it more consistent with the other guides.
Changes:
- Replace all pipe operators with method chaining
- Where required, assigns to intermediate channel and uses in a process
- Moves all logic into workflows
* chore: update starting code
* fix: Name all iterators in closures
* chore: remove extra files
* fix: add all named closures within workflow scope
* fix: Use def for every new variable
* fix: do not use same variable name twice
* fix: remove variables that are not used
* fix: multiMap syntax
* fix: add block to every process
* feat: adds checkIfExists to block users accidentally missing step
* fix: Add period to beginning of map to make copy+paste easier
* fix: add grouping starter back
* style: code style in grouping.md
* fix: MapReads assigning output
* fix: review fixups
* fix: Groovy training
* fix: structure training inconsistencies
* Apply suggestion from @FriederikeHanssen
Co-authored-by: Friederike Hanssen <[email protected]>
* Update docs/advanced/operators.md
Co-authored-by: Friederike Hanssen <[email protected]>
* Update docs/advanced/operators.md
Co-authored-by: Friederike Hanssen <[email protected]>
* docs: update advanced grouping training to new syntax
- Update grouping.md documentation with modern Nextflow syntax
- Modify main.nf workflow to use current DSL2 conventions
- Remove deprecated working_with_files/main.nf file
- Improve code examples and explanations for better learning experience
* feat: add structure training examples with Groovy classes
- Add Dog.groovy and Metadata.groovy classes for structure training
- Update main.nf workflow to demonstrate class usage and imports
- Modify cars.R script for better R integration example
- Update structure.md documentation with new examples
- Demonstrate proper project structure with lib/ directory
* style: Highlight code in operator tour
* style: Highlight code in metadata
* style: Line numbers and highlighting in operators
* style: Line numbers and highlighting in metadata
* style: Line numbers and highlighting in grouping
* fixup
* fixup
* style: Line numbers and highlighting in groovy
* style: Line numbers and highlighting in structure
---------
Co-authored-by: Friederike Hanssen <[email protected]>
docs/advanced/groovy.md: 88 additions & 93 deletions
@@ -10,36 +10,36 @@ cd groovy
10
10
11
11
Let's assume that we would like to pull in a samplesheet, parse the entries and run them through the FastP tool. So far, we have been concerned with local files, but Nextflow will handle remote files transparently:
... but this precludes the possibility of adding additional columns to the samplesheet. We might want to ensure the parsing will capture any extra metadata columns should they be added. Instead, let's partition the column names into those that begin with "fastq" and those that don't:
39
+
... but this precludes the possibility of adding additional columns to the samplesheet. We might want to ensure the parsing will capture any extra metadata columns should they be added. Instead, let's partition the column names into those that begin with "fastq" and those that don't. Within the map closure, let's add an additional line to partition the column names:
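The partitioning step can be tried on its own in plain Groovy. This is a minimal sketch: the row contents and the `type` column are hypothetical, chosen only to show what `split` hands back.

```groovy
// Hypothetical samplesheet row; the column names are assumptions for illustration
def row = [sample: 'sampleA', type: 'tumor', fastq1: 'a_R1.fastq.gz', fastq2: 'a_R2.fastq.gz']

// split returns two collections: elements for which the closure is true, then the rest
def (readKeys, metaKeys) = row.keySet().split { key -> key =~ /^fastq/ }

assert readKeys.toList() == ['fastq1', 'fastq2']
assert metaKeys.toList() == ['sample', 'type']
```

Because the metadata keys are simply "everything that is not a read column", a new column added to the samplesheet ends up in `metaKeys` without any change to the code.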
40
40
41
-
```groovy linenums="1"
42
-
(readKeys, metaKeys) = row.keySet().split { it =~ /^fastq/ }
We're also using the `.split()` method, which divides a collection based on the return value of the closure. The mrhaki blog [provides a succinct summary](https://blog.mrhaki.com/2009/12/groovy-goodness-splitting-with-closures.html).
50
50
51
-
From here, let's
51
+
From here, let's add another line to collect the values of the read keys into a list of file objects:
def reads = row.subMap(readKeys).values().collect { value -> file(value) }
55
55
```
56
56
57
57
... but we run into an error:
58
58
59
-
```groovy linenums="1"
59
+
```groovy
60
60
Argument of `file` function cannot be empty
61
61
```
62
62
63
-
If we have a closer look at the samplesheet, we notice that not all rows have two read pairs. Let's add a condition
63
+
If we have a closer look at the samplesheet, we notice that not all rows have two reads: single-end samples leave a read column empty. Let's add a `findAll` step ahead of `collect` so that only non-empty values are turned into file objects:
64
64
65
-
```groovy linenums="1"
66
-
reads = row
67
-
.subMap(readKeys)
68
-
.values()
69
-
.findAll { it != "" } // Single-end reads will have an empty string
70
-
.collect { file(it) } // Turn those strings into paths
65
+
```groovy linenums="11"
66
+
def reads = row.subMap(readKeys).values()
67
+
.findAll { value -> value != "" } // Single-end reads will have an empty string
68
+
.collect { path -> file(path) }
71
69
```
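The effect of the filter can be checked outside Nextflow. In this sketch the row is hypothetical and the values are left as plain strings, since `file()` is only available inside a Nextflow script.

```groovy
// Hypothetical single-end row: the second read column is an empty string
def row = [sample: 'sampleB', fastq1: 'b_R1.fastq.gz', fastq2: '']
def readKeys = ['fastq1', 'fastq2']

// Drop empty values before they would be passed on to file()
def reads = row.subMap(readKeys).values().findAll { value -> value != '' }

assert reads.size() == 1
assert reads.first() == 'b_R1.fastq.gz'
```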
72
70
73
71
Now we need to construct the meta map. Let's have a quick look at the FASTP module that has already been defined:
@@ -94,37 +92,40 @@ process FASTP {
94
92
95
93
I can see that we require two extra keys, `id` and `single_end`:
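Before the full workflow, here is a small standalone sketch of how those two keys could be derived. The row, the `type` column and the stand-in `reads` list are assumptions for illustration only.

```groovy
// Hypothetical inputs; in the workflow these come from the samplesheet row
def row = [sample: 'sampleB', type: 'normal', fastq1: 'b_R1.fastq.gz', fastq2: '']
def metaKeys = ['sample', 'type']
def reads = ['b_R1.fastq.gz']   // stand-in for the list of file objects

def meta = row.subMap(metaKeys)

// Map + Map returns a new map; the original meta map is left untouched
def fullMeta = meta + [id: meta.sample, single_end: reads.size() == 1]

assert fullMeta == [sample: 'sampleB', type: 'normal', id: 'sampleB', single_end: true]
assert !meta.containsKey('id')
```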
.findAll { value -> value != "" } // Single-end reads will have an empty string
112
+
.collect { path -> file(path) }
113
+
def meta = row.subMap(metaKeys)
114
+
meta = meta + [ id: meta.sample, single_end: reads.size() == 1 ]
115
+
[meta, reads]
116
+
}
119
117
120
-
FASTP.out.json | view
118
+
FASTP(samples)
119
+
120
+
FASTP.out.json.view()
121
+
}
121
122
```
122
123
123
124
Let's assume that we want to pull some information out of these JSON files. To make our lives a little easier, let's "publish" these JSON files so that they are simple to find. We're going to discuss configuration more completely in a later chapter, but that's no reason not to dabble a bit here.
124
125
125
126
We'd like to add a `publishDir` directive to our FASTP process.
126
127
127
-
```groovy linenums="1"
128
+
```groovy linenums="3"
128
129
process {
129
130
withName: 'FASTP' {
130
131
publishDir = [
@@ -155,26 +156,18 @@ This enables us to iterate quickly to test out our JSON parsing without waiting
155
156
nextflow run . -resume
156
157
```
157
158
158
-
Let's consider the possibility that we'd like to capture some of these metrics so that they can be used downstream. First, we'll have a quick peek at the [Groovy docs](https://groovy-lang.org/documentation.html) and I see that I need to import a `JsonSlurper`:
159
-
160
-
```groovy linenums="1"
161
-
import groovy.json.JsonSlurper
162
-
163
-
// We can also import a Yaml parser just as easily:
164
-
// import org.yaml.snakeyaml.Yaml
165
-
// new Yaml().load(new FileReader('your/data.yml'))
166
-
```
159
+
Let's consider the possibility that we'd like to capture some of these metrics so that they can be used downstream. A quick peek at the [Groovy docs](https://groovy-lang.org/documentation.html) shows that we need to use `JsonSlurper`.
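As a quick illustration of what `JsonSlurper` returns, the sketch below parses a small JSON string. The string is a hypothetical, heavily trimmed stand-in for a real fastp report, and the `total_reads` and `q30_rate` fields are assumed for the example.

```groovy
// Hypothetical, trimmed-down stand-in for a fastp JSON report
def jsonText = '{"summary": {"after_filtering": {"total_reads": 1000, "q30_rate": 0.95}}}'

// parseText turns JSON objects into nested Maps
def report = new groovy.json.JsonSlurper().parseText(jsonText)

assert report instanceof Map
assert report.summary.after_filtering.total_reads == 1000
```

Using the fully qualified class name avoids a separate `import` line, at the cost of a slightly longer expression.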
167
160
168
161
Now let's create a second entrypoint to quickly pass these JSON files through some tests:
169
162
170
163
!!! note "Entrypoint developing"
171
164
172
165
Using a second entrypoint allows us to do quick debugging or development using a small section of the workflow without disturbing the main flow.
173
166
174
-
```groovy linenums="1"
167
+
```groovy linenums="5"
175
168
workflow Jsontest {
176
169
Channel.fromPath("results/fastp/json/*.json")
177
-
| view
170
+
.view()
178
171
}
179
172
```
180
173
@@ -184,41 +177,43 @@ which we run with
184
177
nextflow run . -resume -entry Jsontest
185
178
```
186
179
187
-
Let's create a small function at the top of the workflow to take the JSON path and pull out some basic metrics:
180
+
Let's create a small function in the workflow script, above the `Jsontest` entrypoint, to take the JSON path and pull out some basic metrics:
188
181
189
-
```bash
182
+
```groovy linenums="5"
190
183
def getFilteringResult(json_file) {
191
-
fastpResult = new JsonSlurper().parseText(json_file.text)
184
+
return new groovy.json.JsonSlurper().parseText(json_file.text)
192
185
}
193
-
```
194
186
195
-
!!! exercise
196
-
197
-
The `fastpResult` returned from the `parseText` method is a large Map - a class which we're already familiar with. Modify the `getFilteringResult` function to return just the `after_filtering` section of the report.
187
+
workflow Jsontest {
188
+
Channel.fromPath("results/fastp/json/*.json")
189
+
.view()
190
+
}
191
+
```
198
192
199
-
??? solution
193
+
The value returned from the `parseText` method is a large Map - a class which we're already familiar with. Modify the `getFilteringResult` function to return just the `after_filtering` section of the report.
200
194
201
-
Here is one potential solution.
195
+
In the interest of brevity, here is the solution to return just the `after_filtering` section of the report:
202
196
203
-
```groovy linenums="1"
204
-
def getFilteringResult(json_file) {
205
-
new JsonSlurper().parseText(json_file.text)
206
-
?.summary
207
-
?.after_filtering
208
-
}
209
-
```
197
+
```groovy linenums="5"
198
+
def getFilteringResult(json_file) {
199
+
return new groovy.json.JsonSlurper().parseText(json_file.text)
200
+
?.summary
201
+
?.after_filtering
202
+
}
203
+
```
210
204
211
-
!!! note
205
+
!!! note
212
206
213
-
`?.` is new notation is a null-safe access operator. The `?.summary` will access the summary property if the property exists.
207
+
`?.` is new notation: it is the null-safe access operator. `?.summary` accesses the `summary` property if the object on its left is not null, and evaluates to null otherwise.
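The difference is easiest to see with a report that lacks the expected section. Both maps below are hypothetical stand-ins for parsed JSON.

```groovy
// Hypothetical parsed reports: one complete, one with no summary section
def complete = [summary: [after_filtering: [total_reads: 1000]]]
def truncated = [:]

assert complete?.summary?.after_filtering == [total_reads: 1000]

// truncated.summary is null, so ?.after_filtering evaluates to null
// instead of throwing a NullPointerException
assert truncated?.summary?.after_filtering == null
```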
214
208
215
209
We can then join this new map back to the original reads using the `join` operator: