Library statistics plots
SingleCellMultiOmics can generate quality control plots for your libraries.
First demultiplex, map and tag your files as described for NlaIII and scCHIC.
You should end up with a folder structure in which every library is a folder containing the demultiplexed and mapped reads:
Libraries
├─ LibraryA
│ ├─ demultiplexedR1.fastq.gz
│ ├─ demultiplexedR2.fastq.gz
│ ├─ rejectsR1.fastq.gz
│ ├─ rejectsR2.fastq.gz
│ └─ /tagged/
│    ├── sorted.bam
│    └── sorted.bai
│
├─ LibraryB
│ ├─ demultiplexedR1.fastq.gz
│ ├─ demultiplexedR2.fastq.gz
│ ├─ rejectsR1.fastq.gz
│ ├─ rejectsR2.fastq.gz
│ └─ /tagged/
│    ├── sorted.bam
│    └── sorted.bai
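Before running the statistics it can help to confirm that each library folder matches the layout shown above. The following is a minimal sketch; the helper `missing_files` is hypothetical and not part of SingleCellMultiOmics, and it only checks the file names that appear in the tree (the BAM index is not checked).

```python
from pathlib import Path

# File names taken from the folder layout shown above.
EXPECTED_FILES = (
    "demultiplexedR1.fastq.gz",
    "demultiplexedR2.fastq.gz",
    "rejectsR1.fastq.gz",
    "rejectsR2.fastq.gz",
)


def missing_files(library_dir, tagged_bam="tagged/sorted.bam"):
    """Return the expected files that are absent from one library folder.

    Hypothetical helper: `tagged_bam` is the path of the tagged BAM file
    relative to the library folder.
    """
    library = Path(library_dir)
    expected = [library / name for name in EXPECTED_FILES]
    expected.append(library / tagged_bam)
    # Report paths relative to the library folder, with forward slashes.
    return [p.relative_to(library).as_posix() for p in expected if not p.is_file()]
```

Calling `missing_files("LibraryA")` from the Libraries folder returns an empty list when the library is complete, and otherwise lists what is missing.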
Then change directory to the root folder (Libraries) and run the libraryStatistics.py script:
libraryStatistics.py LibraryA LibraryB
If your BAM file is not called sorted.bam, or the subfolder is not called tagged, use the -tagged_bam parameter and supply the relative path to the tagged file inside the library folder.
For example:
Libraries
├─ LibraryA
│ ├─ demultiplexedR1.fastq.gz
│ ├─ demultiplexedR2.fastq.gz
│ ├─ rejectsR1.fastq.gz
│ ├─ rejectsR2.fastq.gz
│ └─ /nlatagged/
│    ├── mapped.bam
│    └── mapped.bai
For this structure use the command:
libraryStatistics.py LibraryA -tagged_bam /nlatagged/mapped.bam
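If you launch the script from Python, the same command can be assembled as an argument list. This is a sketch only: `build_command` is a hypothetical helper, it uses just the library arguments and the -tagged_bam parameter shown above, and whether one -tagged_bam value is applied to several libraries at once is an assumption that the text above does not confirm.

```python
import shlex


def build_command(libraries, tagged_bam=None):
    """Assemble the argument list for a libraryStatistics.py call.

    Hypothetical helper: `libraries` is a list of library folder names and
    `tagged_bam` is the optional value for the -tagged_bam parameter.
    """
    command = ["libraryStatistics.py", *libraries]
    if tagged_bam is not None:
        command += ["-tagged_bam", tagged_bam]
    return command


def as_shell_line(command):
    """Render the argument list as a single, safely quoted shell line."""
    return shlex.join(command)
```

The resulting list can be passed to `subprocess.run(command, check=True)` from the Libraries folder, or printed with `as_shell_line` to inspect it first.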
The script adds two directories to every library: ./plots and ./tables
The plots directory contains plots of the various statistics, and the tables directory contains the data behind those plots in CSV format.
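Because the tables are plain CSV, they can be read back for further analysis. The sketch below assumes the files in ./tables carry a .csv extension and are comma-separated; the file names and column layout are not specified here, so the hypothetical helper `read_tables` simply returns the raw rows of every file it finds.

```python
import csv
from pathlib import Path


def read_tables(library_dir):
    """Map each CSV file name in <library_dir>/tables to its list of rows.

    Hypothetical helper: assumes comma-separated files ending in .csv.
    """
    tables = {}
    for csv_path in sorted(Path(library_dir, "tables").glob("*.csv")):
        with csv_path.open(newline="") as handle:
            tables[csv_path.name] = list(csv.reader(handle))
    return tables
```

For example, `read_tables("LibraryA")` returns a dictionary keyed by table file name, which makes it easy to see which statistics were written for that library.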
The following statistics are calculated:
MethylationContextHistogram
MappingQualityHistogram
OversequencingHistogram
FragmentSizeHistogram
TrimmingStats
AlleleHistogram
RejectionReasonHistogram
DataTypeHistogram
TagHistogram
PlateStatistic
ScCHICLigation
The code for these statistics is defined in singlecellmultiomics/statistic
If you don't care how many raw reads were lost during the mapping and tagging process, you can pass a tagged BAM file directly to libraryStatistics.py:
libraryStatistics.py ./LibraryA/nlatagged/mapped.bam