Commit

Merge branch 'cw-1346' into 'dev'
wf-basecalling v0.2.0

See merge request epi2melabs/workflows/wf-basecalling!8
See CW-1346
See CW-1374
SamStudio8 committed Dec 15, 2022
2 parents b06b6ff + e04233a commit 2d733e8
Showing 7 changed files with 47 additions and 24 deletions.
8 changes: 8 additions & 0 deletions CHANGELOG.md
@@ -4,6 +4,14 @@ All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [v0.2.0]
### Added
* `--basecaller_args` may be used to provide custom arguments to the basecalling process
### Changed
- Updated Dorado to v0.1.1
- Latest models are now v4.0.0
- Workflow prints a more helpful error when Dorado fails due to unknown model name

## [v0.1.2]
### Changed
- Updated description in manifest
12 changes: 4 additions & 8 deletions README.md
@@ -7,10 +7,9 @@ and aligning it with `minimap2` to produce a sorted, indexed CRAM.
## Introduction

This workflow introduces users to [`Dorado`](https://github.com/nanoporetech/dorado),
our alpha-release basecaller. `dorado` is still under active development and
has been released for evaluation purposes. This workflow will be kept updated
as new releases are made. We strongly encourage users to check the CHANGELOG for
breaking changes.
which is now our standard basecaller. `dorado` is still under active development and
will be kept updated as new releases are made. We strongly encourage users to check
the CHANGELOG for breaking changes.
## Quickstart

The workflow uses [nextflow](https://www.nextflow.io/) to manage compute and
@@ -56,15 +55,12 @@ The `dorado` repository has [a table of available models](https://github.com/nan

### Updating the workflow

It is strongly recommended to keep this experimental workflow updated with:
It is recommended to keep this workflow updated to take advantage of the latest basecalling models with:

```
nextflow pull epi2me-labs/wf-basecalling
```

Users are reminded that `dorado` is released for evaluation purposes only.
Users should consult the CHANGELOG to keep up to date with breaking changes.

### Workflow outputs

The primary outputs of the workflow include:
17 changes: 16 additions & 1 deletion basecalling.nf
@@ -18,12 +18,27 @@ process dorado {
path("${chunk_idx}.ubam")
script:
def remora_model = remora_model_override ? "remora_model" : "\${DRD_MODELS_PATH}/${remora_cfg}"
def remora_args = (params.basecaller_basemod_threads > 0 && (params.remora_cfg || remora_model_override)) ? "--remora-models ${remora_model} --remora-threads ${params.basecaller_basemod_threads} --remora-batchsize 1024" : ''
def remora_args = (params.basecaller_basemod_threads > 0 && (params.remora_cfg || remora_model_override)) ? "--modified-bases-models ${remora_model}" : ''
def model_arg = basecaller_model_override ? "dorado_model" : "\${DRD_MODELS_PATH}/${basecaller_cfg}"
def basecaller_args = params.basecaller_args ?: ''
"""
echo '***'
echo 'Available models:'
list-models | sed 's,^,- ,' | sed "s,\${DRD_MODELS_PATH}/,,"
echo '***'
echo 'You selected:'
echo "Basecalling model: ${basecaller_cfg}"
echo "Remora model : ${remora_cfg}"
echo '***'
echo 'A file open error below indicates that you have entered an unknown model name.'
echo 'It is possible the model you selected worked previously but has been updated to a new version.'
echo 'Resubmit this workflow with an appropriate model from the model list above.'
echo '***'
dorado basecaller \
${model_arg} . \
${remora_args} \
${basecaller_args} \
--device ${params.cuda_device} | samtools view -b -o ${chunk_idx}.ubam -
"""
}
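
The argument assembly in the `dorado` process above can be sketched in plain shell. This is a minimal illustration only, not part of the workflow: `build_dorado_cmd` is a hypothetical helper, the modified-base condition is simplified to "threads > 0 and a model name was given", and `--batchsize 64` stands in for whatever a user might pass through `--basecaller_args` (check `dorado basecaller --help` for the options your version accepts).

```shell
#!/bin/sh
# Sketch only: mirrors the Groovy string-building in the `dorado` process.
# build_dorado_cmd is a hypothetical helper and is not part of wf-basecalling.
build_dorado_cmd() {
    local model="$1" remora_model="$2" basemod_threads="$3" extra_args="$4" device="$5"
    local remora_args=""

    # Modified-base calling is only requested when threads > 0 and a model is set
    # (a simplification of the remora_cfg / remora_model_override check).
    if [ "$basemod_threads" -gt 0 ] && [ -n "$remora_model" ]; then
        remora_args="--modified-bases-models ${remora_model}"
    fi

    # An unset extra-args value behaves like `params.basecaller_args ?: ''`:
    # it contributes nothing. `tr -s ' '` squeezes the gaps left by empty parts.
    echo "dorado basecaller ${model} . ${remora_args} ${extra_args} --device ${device}" | tr -s ' '
}

# Print the command that would be built for a basecalling-only run
# with one illustrative extra argument.
build_dorado_cmd 'dna_r10.4.1_e8.2_400bps_hac@v4.0.0' '' 2 '--batchsize 64' 'cuda:all'
```

The sketch only prints the command. In the workflow itself the output of `dorado basecaller` is piped into `samtools view`, so extra arguments that change the output format would likely break that pipe.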
7 changes: 3 additions & 4 deletions docs/intro.md
@@ -1,7 +1,6 @@
## Introduction

This workflow introduces users to [`Dorado`](https://github.com/nanoporetech/dorado),
our alpha-release basecaller. `dorado` is still under active development and
has been released for evaluation purposes. This workflow will be kept updated
as new releases are made. We strongly encourage users to check the CHANGELOG for
breaking changes.
which is now our standard basecaller. `dorado` is still under active development and
will be kept updated as new releases are made. We strongly encourage users to check
the CHANGELOG for breaking changes.
5 changes: 1 addition & 4 deletions docs/quickstart.md
@@ -43,15 +43,12 @@ The `dorado` repository has [a table of available models](https://github.com/nan

### Updating the workflow

It is strongly recommended to keep this experimental workflow updated with:
It is recommended to keep this workflow updated to take advantage of the latest basecalling models with:

```
nextflow pull epi2me-labs/wf-basecalling
```

Users are reminded that `dorado` is released for evaluation purposes only.
Users should consult the CHANGELOG to keep up to date with breaking changes.

### Workflow outputs

The primary outputs of the workflow include:
14 changes: 9 additions & 5 deletions nextflow.config
@@ -13,7 +13,7 @@
params {
help = false
version = false
wfversion = "v0.1.2"
wfversion = "v0.2.0"
aws_image_prefix = null
aws_queue = null
disable_ping = false
@@ -33,6 +33,7 @@ params {
/// common
basecaller_chunk_size = 25
basecaller_cfg = null
basecaller_args = null
basecaller_basemod_threads = 2
cuda_device = "cuda:all"
ubam_map_threads = 8
@@ -46,15 +47,15 @@ params {
dorado_ext = "pod5"

wf {
basecaller_container = "nanoporetech/dorado:shaa939a6e58395033a8cc78dc4977a24bf6d9e4129"
basecaller_container = "nanoporetech/dorado:sha097d9c8abc39b8266e3ee58f531f5ef8944a02c3"
example_cmd = [
"--input /path/to/my/fast5",
"--dorado_ext fast5",
"--ref /path/to/my/ref.fa",
"--out_dir /path/to/my/outputs",
"--basecaller_cfg 'dna_r10.4.1_e8.2_400bps_hac@v3.5.2'",
"--basecaller_cfg 'dna_r10.4.1_e8.2_400bps_hac@v4.0.0'",
"--basecaller_basemod_threads 2",
"--remora_cfg 'dna_r10.4.1_e8.2_400bps_hac@v3.5.2_5mCG@v2'",
"--remora_cfg 'dna_r10.4.1_e8.2_400bps_hac@v4.0.0_5mCG_5hmCG@v2'",
]
agent = null
}
@@ -67,7 +68,7 @@ manifest {
description = 'Helper workflow for basecalling ONT reads.'
mainScript = 'main.nf'
nextflowVersion = '>=21.05.0'
version = '0.1.2'
version = '0.2.0'
}

epi2melabs {
@@ -144,14 +145,17 @@ profiles {
timeline {
enabled = true
file = "${params.out_dir}/execution/timeline.html"
overwrite = true
}
report {
enabled = true
file = "${params.out_dir}/execution/report.html"
overwrite = true
}
trace {
enabled = true
file = "${params.out_dir}/execution/trace.txt"
overwrite = true
}

env {
8 changes: 6 additions & 2 deletions nextflow_schema.json
@@ -124,6 +124,10 @@
"format": "directory-path",
"description": "Override the inferred model with a custom remora model",
"help_text": "For typical use, users should set --remora_cfg which will use a named model from inside the container. Experimental or custom models will not be available in the container and can be loaded from the host with --remora_model_path."
},
"basecaller_args": {
"type": "string",
"description": "Additional command line arguments to pass to the basecaller process."
}
},
"required": []
@@ -210,7 +214,7 @@
},
"wfversion": {
"type": "string",
"default": "v0.1.2",
"default": "v0.2.0",
"hidden": true
},
"monochrome_logs": {
@@ -225,7 +229,7 @@
}
},
"docs": {
"intro": "## Introduction\n\nThis workflow introduces users to [`Dorado`](https://github.com/nanoporetech/dorado),\nour alpha-release basecaller. `dorado` is still under active development and\nhas been released for evaluation purposes. This workflow will be kept updated\nas new releases are made. We strongly encourage users to check the CHANGELOG for\nbreaking changes.\n",
"intro": "## Introduction\n\nThis workflow introduces users to [`Dorado`](https://github.com/nanoporetech/dorado),\nwhich is now our standard basecaller. `dorado` is still under active development and\nwill be kept updated as new releases are made. We strongly encourage users to check\nthe CHANGELOG for breaking changes.\n",
"links": "## Useful links\n\n* [nextflow](https://www.nextflow.io/)\n* [docker](https://www.docker.com/products/docker-desktop)\n* [singularity](https://sylabs.io/singularity/)\n* [dorado](https://github.com/nanoporetech/dorado/)\n"
}
}
