Hi, I tried to use bindash to process my fasta data. I first used the following command:
./bindash sketch mydata.fas --outfname=genomeA.sketch
The mydata.fas file is about 50 MB and contains more than 20,000 nucleotide sequences, but the generated .sketch file is only 1 KB. Something seems wrong, but I don't know what to change.
Are there any requirements for the input data format?
The output file size depends only on the sketch parameters (the --sketchsize64 M and --bbits N options), not on the size of the input, as long as your purpose is to compute genomic distances among your files. A sketch is just the first N bits of M 64-bit integers, so it is small by design, and a tiny .sketch file does not indicate a problem. If you need accurate estimates at 99% or even 99.99% ANI and above (average nucleotide identity, a widely used metric for genomic distance), you can increase --sketchsize64 to 200 or even several thousand. Note that this tool is only for genomic distance estimation, not for FASTQ/FASTA quality control or similar tasks.
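To illustrate why the sketch does not grow with the input, here is a minimal toy sketch in Python. It is not BinDash's actual algorithm or file format: the function names, the BLAKE2 hash, the binning scheme, and the parameter values (64 bins, 14 bits per bin, k = 21) are all assumptions made for illustration. It only shows the general idea of keeping a fixed number of truncated hash values regardless of how many k-mers go in.

```python
import hashlib
import random


def random_genome(length: int, seed: int = 0) -> str:
    """Generate a random nucleotide string (hypothetical test data)."""
    rng = random.Random(seed)
    return "".join(rng.choices("ACGT", k=length))


def toy_sketch(sequence: str, k: int = 21, num_bins: int = 64, bbits: int = 14) -> bytes:
    """Toy fixed-size sketch: one minimum hash per bin, truncated to `bbits` bits.

    Illustration only; NOT BinDash's real implementation or on-disk format.
    """
    mins = [None] * num_bins
    for i in range(len(sequence) - k + 1):
        kmer = sequence[i:i + k].encode()
        h = int.from_bytes(hashlib.blake2b(kmer, digest_size=8).digest(), "big")
        bin_index = h % num_bins      # which bin this k-mer falls into
        value = h // num_bins         # remaining hash bits, compared within the bin
        if mins[bin_index] is None or value < mins[bin_index]:
            mins[bin_index] = value

    # Keep only `bbits` bits per bin and pack them into one byte string.
    mask = (1 << bbits) - 1
    packed = 0
    for m in mins:
        packed = (packed << bbits) | ((m or 0) & mask)  # empty bins stored as 0
    return packed.to_bytes((num_bins * bbits + 7) // 8, "big")


small = toy_sketch(random_genome(1_000))
large = toy_sketch(random_genome(50_000))
print(len(small), len(large))  # both 112 bytes: 64 bins * 14 bits / 8
```

A 50x larger input yields a sketch of exactly the same length, because the size is set by the number of bins and the bits kept per bin. Only changing those two parameters changes the output size.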