When I run CADD on a VCF file with 200k variants, the prescore matching step performed by extract_scored.py is quite time-consuming. I think this step could be accelerated by matching each chromosome in parallel.
I suggest splitting the prescore file into 24 pieces by chromosome and splitting the input VCF by chromosome as well. extract_scored.py would then run once per chromosome, with the runs executing in parallel.
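To make the idea concrete, here is a minimal sketch of the split-and-dispatch step. It is not taken from the CADD code base: `split_by_chromosome`, `match_chromosome`, and `run_parallel` are hypothetical names, and `match_chromosome` is only a placeholder that counts variants where the real per-chromosome call to extract_scored.py would go (I have not assumed anything about that script's command-line interface).

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def split_by_chromosome(vcf_lines):
    """Group VCF data lines by their CHROM column; '#' header lines are kept apart."""
    header, groups = [], defaultdict(list)
    for line in vcf_lines:
        if line.startswith("#"):
            header.append(line)
        elif line.strip():
            chrom = line.split("\t", 1)[0]  # CHROM is the first tab-separated field
            groups[chrom].append(line)
    return header, dict(groups)


def match_chromosome(chrom, lines):
    """Hypothetical placeholder for the per-chromosome prescore match.

    A real worker would match `lines` against the prescore piece for `chrom`;
    here it only reports how many variants it received.
    """
    return chrom, len(lines)


def run_parallel(vcf_lines, worker=match_chromosome, max_workers=4):
    """Run `worker` once per chromosome and collect the results in a dict."""
    _, groups = split_by_chromosome(vcf_lines)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(worker, c, ls) for c, ls in groups.items()]
        return dict(f.result() for f in futures)
```

Threads are enough for this sketch if each worker launches an external process; a worker that does the matching in pure Python would probably want `ProcessPoolExecutor` instead.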
If that is OK with you, I can open a PR later. Thanks!