You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Impacts Hisat2 and Star aligner.
The prepDE.py script is used to calculate raw counts. It has a parameter called -l (length) which requires the user to input the average length of each read. It defaults to 75 bp. Currently, GEMmaker does not reset this parameter. This will result in incorrect raw read counts being reported to the user.
This is a problem and should be fixed, but I don't think it means GEMmaker is delivering counts that are not "correct". Even if the average read length was given properly (not the 75bp default), the count is still technically not "correct". It's an estimate using the formula reads_per_transcript = coverage * transcript_len / read_len. For example, for trimmed reads you can still have those that are shorter or longer than the average so the count won't be perfect.
I think the problem isn't with the counts being incorrect, it's more of a challenge of being able to compare across samples that have different average read lengths.
If your input files are all from the same experiment odds are all of the read lengths across the samples will be the same -- no problem.
If your input files are from samples that have different read lengths then a cross-sample normalization, I think, should clear things up. Folks with larger datasets should be checking for this and normalizing.
if your sample file is absolutely huge, then it's hard to normalize and there may be some comparison problems. But correlation analysis should still be okay (with some added noise).
Description of the bug
Impacts Hisat2 and Star aligner.
The prepDE.py script is used to calculate
raw
counts. It has a parameter called-l
(length) which requires the user to input the average length of each read. It defaults to 75 bp. Currently, GEMmaker does not reset this parameter. This will result in incorrect raw read counts being reported to the user.the code for
prepDE.py
the code which needs to be changed:
Suggested change:
pass an argument with readlength to stringtie, use this parameter
Command used and terminal output
No response
Relevant files
No response
System information
No response
The text was updated successfully, but these errors were encountered: