SequenceGenie

These algorithms are tuned to exploit multiple processors, so running the code on a multi-processor machine will result in a significant performance improvement.

Because it runs in Docker, this code can be run on cloud computing platforms such as Google Compute Engine, where it has been tested.

From FASTA files

To run sbc-ngs from pre-compiled FASTA files of target sequences, run:

bash docker_build.sh
bash docker_run.sh [FULL_PATH_TO_DATA_DIRECTORY] [MIN_SEQ_LEN] [MAX_SEQ_FILES]

(e.g. bash docker_run.sh /Users/username/SequenceGenie/example/fasta 1000 128)

The value [MIN_SEQ_LEN] is the minimum length that a data read must have to be considered in the analysis. The appropriate value depends upon the length of the template sequences to be matched; 1000 is a sensible default.

The value [MAX_SEQ_FILES] is the maximum number of data sequence files to consider in the analysis. Note that a data sequence file (e.g. a fastq or fasta file) is likely to contain multiple reads: this value limits the number of files read, not the number of sequences. Increasing this value (considering more data) will lead to more reliable results at the expense of a longer run time.
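To make the two parameters concrete, the following is a minimal Python sketch of how such limits could be applied. It is not taken from the sbc-ngs source: the function names select_files and filter_reads, the list of file suffixes, and the choice to keep reads whose length equals the minimum are all assumptions made for illustration.

```python
from pathlib import Path
from typing import Iterable, List

# Assumption: file suffixes treated as sequence data in this sketch.
SEQ_SUFFIXES = {".fastq", ".fq", ".fasta", ".fa"}


def select_files(data_dir: Path, max_seq_files: int) -> List[Path]:
    """Return at most max_seq_files sequence files found under data_dir."""
    files = sorted(
        p for p in data_dir.rglob("*")
        if p.is_file() and p.suffix.lower() in SEQ_SUFFIXES
    )
    # The limit applies to files, not to the reads inside them.
    return files[:max_seq_files]


def filter_reads(reads: Iterable[str], min_seq_len: int) -> List[str]:
    """Keep only reads that are at least min_seq_len bases long."""
    return [read for read in reads if len(read) >= min_seq_len]
```

With these hypothetical helpers, select_files(Path("example/fasta/data"), 128) would return no more than 128 files, and filter_reads(reads, 1000) would discard any read shorter than 1000 bases.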

The example command above uses the data provided in the example/fasta directory. Any data directory must follow the same layout as example/fasta. Specifically, the directory must contain the following subdirectories:

data. This directory contains the sequence data to be analysed (typically in fastq format, but other formats are also supported), along with a file barcodes.csv, which defines the barcode sequences applied to each sample and the sequence id of each sample. This file requires the headers well, known_seq_id, forward and reverse.
seqs. This directory contains fasta files of the template sequences against which the sequence data will be aligned. The fasta files must share the same names as the values in the known_seq_id column of the barcodes.csv file, described above.
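
The layout described above can be checked before starting a run. The following Python sketch is one possible way to do so and is not part of sbc-ngs: the function name check_layout is hypothetical, and the code assumes that each template file is named [known_seq_id].fasta, which may not match every naming scheme the tool accepts.

```python
import csv
from pathlib import Path
from typing import List

REQUIRED_HEADERS = {"well", "known_seq_id", "forward", "reverse"}


def check_layout(root: Path) -> List[str]:
    """Return a list of problems; an empty list means no problem was found."""
    problems = []
    data_dir, seqs_dir = root / "data", root / "seqs"
    for directory in (data_dir, seqs_dir):
        if not directory.is_dir():
            problems.append(f"missing directory: {directory.name}")

    barcodes = data_dir / "barcodes.csv"
    if not barcodes.is_file():
        problems.append("missing file: data/barcodes.csv")
        return problems

    with barcodes.open(newline="") as handle:
        reader = csv.DictReader(handle)
        missing = REQUIRED_HEADERS - set(reader.fieldnames or [])
        if missing:
            problems.append(f"barcodes.csv lacks headers: {sorted(missing)}")
            return problems
        seq_ids = {row["known_seq_id"] for row in reader if row["known_seq_id"]}

    # Assumption: a template for id X is stored as seqs/X.fasta.
    available = {p.stem for p in seqs_dir.glob("*.fasta")} if seqs_dir.is_dir() else set()
    for seq_id in sorted(seq_ids - available):
        problems.append(f"no template fasta for known_seq_id: {seq_id}")
    return problems
```

Calling check_layout(Path("/Users/username/SequenceGenie/example/fasta")) and printing the returned list gives a quick indication of whether a directory is ready to be passed to docker_run.sh.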

From template data held in JBEI-ICE

(Note: this is the approach typically used in the SYNBIOCHEM centre.)

To run sbc-ngs from target sequences held in JBEI-ICE, run:

bash docker_build.sh
bash docker_run_ice.sh [TARGET_DIRECTORY] [ICE_USERNAME] [ICE_PASSWORD] [MIN_SEQ_LEN] [MAX_SEQ_FILES]

(e.g. bash docker_run_ice.sh /Users/username/SequenceGenie/example/ice [email protected] password 1000 128)
