QC cutoffs #245
Replies: 2 comments 2 replies
-
I think the missing positions are the most important. We often have a few missing positions in the rrs/rrl genes, but they are not as important, so we are fine with that.
-
Hi @bajwamoneeb, it depends a bit on what you want the QC for. If it is just for drug resistance calling then, as @pmenzel suggested, you could look at the "missing_pos" value in the json output. This is populated with all the drug resistance positions that were not covered at the configured depth cutoff. Another type of QC could check that all or a minimum number of genes have full coverage using these values.
The above solution works OK if you are looking for resistance to the well characterised drugs, e.g. rif, inh, etc. But where we don't yet know the resistance mutations (e.g. bedaquiline), it might be better to filter on the "gene_coverage" values in the json output. For each gene that is analysed, the pipeline extracts the fraction of the gene with coverage <= a user-specified cutoff (default=0). For example, a fraction of 0 at the default cutoff means that all positions in the gene have > 0 coverage.
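As a rough sketch of how those two checks could be scripted: the dictionary layout below (a list of per-gene entries under "gene_coverage" with "gene", "cutoff" and "fraction" keys, and per-position entries under "missing_pos" with a "gene" key) is an assumption made for illustration and may not match the real json output, so the key names would need adjusting. The function names are hypothetical too. Tolerating rrs/rrl follows the comment above about those genes.

```python
# Hypothetical QC layout, loosely based on the field names mentioned above.
# The real json output may nest or name these fields differently.
example_qc = {
    "gene_coverage": [
        {"gene": "rpoB", "cutoff": 0, "fraction": 0.0},
        {"gene": "katG", "cutoff": 0, "fraction": 0.0},
        {"gene": "atpE", "cutoff": 0, "fraction": 0.12},
    ],
    "missing_pos": [
        {"gene": "rrs"},
    ],
}


def genes_below_coverage(qc, max_fraction=0.0):
    """Return genes whose low-coverage fraction exceeds max_fraction."""
    return sorted(
        entry["gene"]
        for entry in qc.get("gene_coverage", [])
        if entry["fraction"] > max_fraction
    )


def missing_positions_outside(qc, tolerated_genes=frozenset({"rrs", "rrl"})):
    """Return missing resistance positions that are not in tolerated genes."""
    return [
        entry
        for entry in qc.get("missing_pos", [])
        if entry["gene"] not in tolerated_genes
    ]


low_genes = genes_below_coverage(example_qc)
bad_missing = missing_positions_outside(example_qc)
print("Genes without full coverage:", low_genes)
print("Missing positions outside tolerated genes:", bad_missing)
```

A sample could then be flagged if either list is non-empty, or if more than some agreed number of genes lack full coverage.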
If it is more about general sample quality for further analysis like phylogenetics, then I usually use a minimum cutoff of 30x median coverage regardless of the number of reads (as the read length can vary quite a lot).
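A minimal sketch of that median coverage check, assuming you already have a list of per-base depths for the sample (for instance parsed from the third column of `samtools depth -a` output). The function name and the toy depth values are made up for illustration; only the 30x threshold comes from the comment above.

```python
from statistics import median


def passes_depth_qc(per_base_depth, min_median=30):
    """Return True if the median per-base depth is at least min_median."""
    return median(per_base_depth) >= min_median


# Toy depth values, not real data.
good_sample = [35, 40, 28, 31, 50]
poor_sample = [5, 12, 9, 30, 8]

print(passes_depth_qc(good_sample))
print(passes_depth_qc(poor_sample))
```

Using the median rather than the read count keeps the check independent of read length, which is the point made above.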
-
Hi Jody,
What QC cutoffs would you recommend regarding things like number of reads mapped, median coverage, missing positions, etc.? For example, some are using <10x coverage and <~8,000 reads mapped as cutoffs for failing a TB sample.