
Commit a5133d9

Update advanced training to current syntax (#641)
* (feat): Update advanced training to new syntax

  Updates to the advanced training to use the current syntax making it more consistent with the other guides. Changes:

  - Replace all pipe operators with method chaining
  - Where required, assigns to intermediate channel and uses in a process
  - Moves all logic into workflows

* chore: update starting code
* fix: Name all iterators in closures
* chore: remove extra files
* fix: add all named closures within workflow scope
* fix: Use def for every new variable
* fix: do not use same variable name twice
* fix: remove variables that are not used
* fix: multiMap syntax
* fix: add block to every process
* feat: adds checkIfExists to block users accidentally missing step
* fix: Add period to beginning of map to make copy+paste easier
* fix: add grouping starter back
* style: code style in grouping.md
* fix: MapReads assigning output
* fix: review fixups
* fix: Groovy training
* fix: structure training inconsistencies
* Apply suggestion from @FriederikeHanssen

  Co-authored-by: Friederike Hanssen <[email protected]>

* Update docs/advanced/operators.md

  Co-authored-by: Friederike Hanssen <[email protected]>

* Update docs/advanced/operators.md

  Co-authored-by: Friederike Hanssen <[email protected]>

* docs: update advanced grouping training to new syntax

  - Update grouping.md documentation with modern Nextflow syntax
  - Modify main.nf workflow to use current DSL2 conventions
  - Remove deprecated working_with_files/main.nf file
  - Improve code examples and explanations for better learning experience

* feat: add structure training examples with Groovy classes

  - Add Dog.groovy and Metadata.groovy classes for structure training
  - Update main.nf workflow to demonstrate class usage and imports
  - Modify cars.R script for better R integration example
  - Update structure.md documentation with new examples
  - Demonstrate proper project structure with lib/ directory

* style: Highlight code in operator tour
* style: Highlight code in metadata
* style: Line numbers and highlighting in operators
* style: Line numbers and highlighting in metadata
* style: Line numbers and highlighting in grouping
* fixup
* fixup
* style: Line numbers and highlighting in groovy
* style: Line numbers and highlighting in structure

---------

Co-authored-by: Friederike Hanssen <[email protected]>
1 parent 9ffcd98 commit a5133d9

File tree

14 files changed, +679 -606 lines changed


docs/advanced/groovy.md

Lines changed: 88 additions & 93 deletions
@@ -10,36 +10,36 @@ cd groovy
 
 Let's assume that we would like to pull in a samplesheet, parse the entries and run them through the FastP tool. So far, we have been concerned with local files, but Nextflow will handle remote files transparently:
 
-```groovy linenums="1"
+```groovy linenums="3"
+params.input = "https://raw.githubusercontent.com/nf-core/test-datasets/rnaseq/samplesheet/v3.10/samplesheet_test.csv"
+
 workflow {
-    params.input = "https://raw.githubusercontent.com/nf-core/test-datasets/rnaseq/samplesheet/v3.10/samplesheet_test.csv"
 
     Channel.fromPath(params.input)
-    | splitCsv(header: true)
-    | view
+        .splitCsv(header: true)
+        .view()
 }
 ```
 
 Let's write a small closure to parse each row into the now-familiar map + files shape. We might start by constructing the meta-map:
 
-```groovy linenums="1"
+```groovy linenums="5" hl_lines="5-8"
 workflow {
-    params.input = "https://raw.githubusercontent.com/nf-core/test-datasets/rnaseq/samplesheet/v3.10/samplesheet_test.csv"
 
-    Channel.fromPath(params.input)
-    | splitCsv(header: true)
-    | map { row ->
-        meta = row.subMap('sample', 'strandedness')
-        meta
-    }
-    | view
+    samples = Channel.fromPath(params.input)
+        .splitCsv(header: true)
+        .map { row ->
+            def meta = row.subMap('sample', 'strandedness')
+            meta
+        }
+        .view()
 }
 ```
 
-... but this precludes the possibility of adding additional columns to the samplesheet. We might to ensure the parsing will capture any extra metadata columns should they be added. Instead, let's partition the column names into those that begin with "fastq" and those that don't:
+... but this precludes the possibility of adding additional columns to the samplesheet. We might want to ensure the parsing will capture any extra metadata columns should they be added. Instead, let's partition the column names into those that begin with "fastq" and those that don't. Within the map closure, let's add an additional line to partition the column names:
 
-```groovy linenums="1"
-(readKeys, metaKeys) = row.keySet().split { it =~ /^fastq/ }
+```groovy linenums="10"
+def (readKeys, metaKeys) = row.keySet().split { key -> key =~ /^fastq/ }
 ```
 
 !!! note "New methods"
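
As an aside, the `subMap` call introduced in this hunk is plain Groovy and can be tried outside Nextflow. A minimal sketch, with a hypothetical `row` shaped like one entry produced by `splitCsv(header: true)`:

```groovy
// Hypothetical row, shaped like one parsed samplesheet entry
def row = [sample: 'GM12878', strandedness: 'reverse', fastq_1: 'a_1.fastq.gz']

// subMap returns a new map containing only the requested keys
def meta = row.subMap('sample', 'strandedness')
assert meta == [sample: 'GM12878', strandedness: 'reverse']
```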
@@ -48,26 +48,24 @@ workflow {
 
     We're also using the `.split()` method, which divides a collection based on the return value of the closure. The mrhaki blog [provides a succinct summary](https://blog.mrhaki.com/2009/12/groovy-goodness-splitting-with-closures.html).
 
-From here, let's
+From here, let's add another line to collect the values of the read keys into a list of file objects:
 
-```groovy linenums="1"
-reads = row.subMap(readKeys).values().collect { file(it) }
+```groovy linenums="11"
+def reads = row.subMap(readKeys).values().collect { value -> file(value) }
 ```
 
 ... but we run into an error:
 
-```groovy linenums="1"
+```groovy
 Argument of `file` function cannot be empty
 ```
 
-If we have a closer look at the samplesheet, we notice that not all rows have two read pairs. Let's add a condition
+If we have a closer look at the samplesheet, we notice that not all rows have two read pairs. Let's add a condition to keep only the values that are not empty:
 
-```groovy linenums="1"
-reads = row
-    .subMap(readKeys)
-    .values()
-    .findAll { it != "" } // Single-end reads will have an empty string
-    .collect { file(it) } // Turn those strings into paths
+```groovy linenums="11"
+def reads = row.subMap(readKeys).values()
+    .findAll { value -> value != "" } // Single-end reads will have an empty string
+    .collect { path -> file(path) }
 ```
 
 Now we need to construct the meta map. Let's have a quick look at the FASTP module that I've already pre-defined:
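
The partition-and-filter chain in this hunk is also testable in plain Groovy. A small sketch with made-up values, where an empty `fastq_2` stands in for a single-end sample:

```groovy
// Made-up row: fastq_2 is empty, as for a single-end sample
def row = [sample: 's1', strandedness: 'auto', fastq_1: 'r1.fq.gz', fastq_2: '']

// split partitions the keys into those matching the closure and the rest
def (readKeys, metaKeys) = row.keySet().split { key -> key =~ /^fastq/ }
assert readKeys.toList() == ['fastq_1', 'fastq_2']
assert metaKeys.toList() == ['sample', 'strandedness']

// findAll drops the empty values before they would reach file()
def reads = row.subMap(readKeys).values().findAll { value -> value != "" }
assert reads.toList() == ['r1.fq.gz']
```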
@@ -94,37 +92,40 @@ process FASTP {
 
 I can see that we require two extra keys, `id` and `single_end`:
 
-```groovy linenums="1"
-meta = row.subMap(metaKeys)
-meta.id ?= meta.sample
-meta.single_end = reads.size == 1
+```groovy linenums="14" hl_lines="1-3"
+def meta = row.subMap(metaKeys)
+meta = meta + [ id: meta.sample, single_end: reads.size == 1 ]
+[meta, reads]
 ```
 
 This is now able to be passed through to our FASTP process:
 
-```groovy linenums="1"
-Channel.fromPath(params.input)
-| splitCsv(header: true)
-| map { row ->
-    (readKeys, metaKeys) = row.keySet().split { it =~ /^fastq/ }
-    reads = row.subMap(readKeys).values()
-        .findAll { it != "" } // Single-end reads will have an empty string
-        .collect { file(it) } // Turn those strings into paths
-    meta = row.subMap(metaKeys)
-    meta.id ?= meta.sample
-    meta.single_end = reads.size == 1
-    [meta, reads]
-}
-| FASTP
+```groovy linenums="5" hl_lines="15 17"
+workflow {
+
+    samples = Channel.fromPath(params.input)
+        .splitCsv(header: true)
+        .map { row ->
+            def (readKeys, metaKeys) = row.keySet().split { key -> key =~ /^fastq/ }
+            def reads = row.subMap(readKeys).values()
+                .findAll { value -> value != "" } // Single-end reads will have an empty string
+                .collect { path -> file(path) }
+            def meta = row.subMap(metaKeys)
+            meta = meta + [ id: meta.sample, single_end: reads.size == 1 ]
+            [meta, reads]
+        }
 
-FASTP.out.json | view
+    FASTP(samples)
+
+    FASTP.out.json.view()
+}
 ```
 
 Let's assume that we want to pull some information out of these JSON files. To make our lives a little more convenient, let's "publish" these json files so that they are easier to find. We're going to discuss configuration more completely in a later chapter, but that's no reason not to dabble a bit here.
 
 We'd like to add a `publishDir` directive to our FASTP process.
 
-```groovy linenums="1"
+```groovy linenums="3"
 process {
     withName: 'FASTP' {
         publishDir = [
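
One detail of this hunk deserves a note: the updated code extends the meta map with `+` instead of mutating it. In plain Groovy, `+` on maps returns a new map, with entries on the right winning on key clashes. A short sketch with hypothetical values:

```groovy
def meta = [sample: 'GM12878', strandedness: 'reverse']

// + builds a new map; meta itself is left untouched
def extended = meta + [id: meta.sample, single_end: true]

assert extended == [sample: 'GM12878', strandedness: 'reverse', id: 'GM12878', single_end: true]
assert meta == [sample: 'GM12878', strandedness: 'reverse']
```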
@@ -155,26 +156,18 @@ This enables us to iterate quickly to test out our JSON parsing without waiting
 nextflow run . -resume
 ```
 
-Let's consider the possibility that we'd like to capture some of these metrics so that they can be used downstream. First, we'll have a quick peek at the [Groovy docs](https://groovy-lang.org/documentation.html) and I see that I need to import a `JsonSlurper`:
-
-```groovy linenums="1"
-import groovy.json.JsonSlurper
-
-// We can also import a Yaml parser just as easily:
-// import org.yaml.snakeyaml.Yaml
-// new Yaml().load(new FileReader('your/data.yml'))
-```
+Let's consider the possibility that we'd like to capture some of these metrics so that they can be used downstream. First, we'll have a quick peek at the [Groovy docs](https://groovy-lang.org/documentation.html) and I see that I need to use `JsonSlurper`.
 
 Now let's create a second entrypoint to quickly pass these JSON files through some tests:
 
 !!! note "Entrypoint developing"
 
     Using a second Entrypoint allows us to do quick debugging or development using a small section of the workflow without disturbing the main flow.
 
-```groovy linenums="1"
+```groovy linenums="5"
 workflow Jsontest {
     Channel.fromPath("results/fastp/json/*.json")
-    | view
+        .view()
 }
 ```
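`JsonSlurper` itself needs no Nextflow machinery: it turns JSON text into ordinary Groovy maps and lists. A minimal sketch, with a made-up snippet standing in for a fastp report:

```groovy
import groovy.json.JsonSlurper

// Made-up JSON standing in for a fastp report
def text = '{"summary": {"after_filtering": {"q30_rate": 0.94}}}'

// parseText returns nested maps/lists that can be navigated like any Groovy data
def report = new JsonSlurper().parseText(text)
assert report.summary.after_filtering.q30_rate == 0.94
```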
@@ -184,41 +177,43 @@ which we run with
 nextflow run . -resume -entry Jsontest
 ```
 
-Let's create a small function at the top of the workflow to take the JSON path and pull out some basic metrics:
+Let's create a small function inside the workflow to take the JSON path and pull out some basic metrics:
 
-```bash
+```groovy linenums="5"
 def getFilteringResult(json_file) {
-    fastpResult = new JsonSlurper().parseText(json_file.text)
+    return new groovy.json.JsonSlurper().parseText(json_file.text)
 }
-```
 
-!!! exercise
-
-    The `fastpResult` returned from the `parseText` method is a large Map - a class which we're already familiar with. Modify the `getFilteringResult` function to return just the `after_filtering` section of the report.
+workflow Jsontest {
+    Channel.fromPath("results/fastp/json/*.json")
+        .view()
+}
+```
 
-??? solution
+The `fastpResult` returned from the `parseText` method is a large Map - a class which we're already familiar with. Modify the `getFilteringResult` function to return just the `after_filtering` section of the report.
 
-    Here is one potential solution.
+In the interest of brevity, here is the solution to return just the `after_filtering` section of the report:
 
-    ```groovy linenums="1"
-    def getFilteringResult(json_file) {
-        new JsonSlurper().parseText(json_file.text)
-            ?.summary
-            ?.after_filtering
-    }
-    ```
+```groovy linenums="5"
+def getFilteringResult(json_file) {
+    return new groovy.json.JsonSlurper().parseText(json_file.text)
+        ?.summary
+        ?.after_filtering
+}
+```
 
-    !!! note
+!!! note
 
-        `?.` is new notation is a null-safe access operator. The `?.summary` will access the summary property if the property exists.
+    `?.` is new notation: a null-safe access operator. `?.summary` will access the summary property if the property exists.
 
 We can then join this new map back to the original reads using the `join` operator:
 
-```groovy linenums="1"
-FASTP.out.json
-    | map { meta, json -> [meta, getFilteringResult(json)] }
-    | join( FASTP.out.reads )
-    | view
+```groovy linenums="31"
+    FASTP.out.json
+        .map { meta, json -> [meta, getFilteringResult(json)] }
+        .join( FASTP.out.reads )
+        .view()
+}
 ```
 
 !!! exercise
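
The null-safe operator in this hunk is easy to probe in isolation. A tiny sketch with hypothetical maps: each `?.` yields `null` instead of throwing when its left-hand side is `null`:

```groovy
def full = [summary: [after_filtering: [q30_rate: 0.94]]]
def empty = [:]

assert full?.summary?.after_filtering?.q30_rate == 0.94
assert empty?.summary?.after_filtering == null // no NullPointerException
```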
@@ -227,17 +222,17 @@ FASTP.out.json
 
 ??? solution
 
-    ```groovy linenums="1"
-    FASTP.out.json
-        | map { meta, json -> [meta, getFilteringResult(json)] }
-        | join( FASTP.out.reads )
-        | map { meta, fastpMap, reads -> [meta + fastpMap, reads] }
-        | branch { meta, reads ->
-            pass: meta.q30_rate >= 0.935
-            fail: true
-        }
-        | set { reads }
-
-    reads.fail | view { meta, reads -> "Failed: ${meta.id}" }
-    reads.pass | view { meta, reads -> "Passed: ${meta.id}" }
+    ```groovy linenums="31"
+    reads = FASTP.out.json
+        .map { meta, json -> [meta, getFilteringResult(json)] }
+        .join( FASTP.out.reads )
+        .map { meta, fastpMap, reads -> [meta + fastpMap, reads] }
+        .branch { meta, reads ->
+            pass: meta.q30_rate >= 0.935
+            fail: true
+        }
+
+    reads.fail.view { meta, _reads -> "Failed: ${meta.id}" }
+    reads.pass.view { meta, _reads -> "Passed: ${meta.id}" }
+    }
     ```
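
Finally, the `branch` operator used in the solution can be demonstrated with a self-contained workflow. A sketch with invented items and the same illustrative 0.935 threshold; each item is routed to the channel of the first predicate that returns true:

```groovy
workflow {
    def checked = Channel
        .of([[id: 'a', q30_rate: 0.95], 'a.fastq.gz'],
            [[id: 'b', q30_rate: 0.90], 'b.fastq.gz'])
        .branch { meta, reads ->
            pass: meta.q30_rate >= 0.935
            fail: true
        }

    checked.pass.view { meta, _reads -> "Passed: ${meta.id}" } // emits item 'a'
    checked.fail.view { meta, _reads -> "Failed: ${meta.id}" } // emits item 'b'
}
```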
