PhylogenomicPipeline

A pipeline for running a complete phylogenomic analysis from a set of FASTA files, one for each shared ortholog group. It is built on Snakemake and runs its tools from a container.

In short, it performs the following steps:

Multiple sequence alignment with MAFFT
Extraction of conserved blocks with GBlocks
Generation of a phylogenetic tree for each ortholog group with RAxML
Generation of a species tree with ASTRAL
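
As a rough orientation, the four steps above could look like the following when written out by hand. This is only a sketch: the function names, file names, the protein setting (`-t=p`, `PROTGAMMAAUTO`), the seed and the `astral.jar` path are all assumptions for illustration, and the Snakefile in this repository remains the reference for what is actually executed. The functions only print the commands; they do not run them.

```shell
# Print (do not run) one plausible command per step for a single ortholog group.
# All file names and options below are illustrative assumptions, not the
# exact calls made by the Snakefile.
plan_group() {
    local og="$1"
    # 1. multiple sequence alignment
    echo "mafft --auto ${og}.fasta > ${og}.aln.fasta"
    # 2. conserved blocks; Gblocks writes its result to <input>-gb
    echo "Gblocks ${og}.aln.fasta -t=p"
    # 3. one gene tree per ortholog group
    echo "raxmlHPC -s ${og}.aln.fasta-gb -n ${og} -m PROTGAMMAAUTO -p 12345"
}

# Print the commands that would combine all gene trees into a species tree.
plan_species_tree() {
    echo "cat RAxML_bestTree.* > gene_trees.tre"
    # 4. species tree from the collected gene trees
    echo "java -jar astral.jar -i gene_trees.tre -o species_tree.tre"
}
```

Calling `plan_group OG0001` followed by `plan_species_tree` prints the command sequence for one hypothetical group named `OG0001`.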

Requirements

Snakemake
Singularity (or Docker)
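
A quick way to confirm that these tools are available before starting is a small PATH check. The sketch below is not part of the repository; the function names `check_tool` and `check_requirements` are made up for this example.

```shell
# Report whether a single tool is on PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# Check Snakemake plus at least one container runtime.
check_requirements() {
    local status=0
    check_tool snakemake || status=1
    # Docker is only tried when Singularity is missing
    check_tool singularity || check_tool docker || status=1
    return "$status"
}
```

Run `check_requirements` in the shell you plan to use for the pipeline; it returns a non-zero status if Snakemake or both container runtimes are missing.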

Instructions

Clone this repository

git clone https://github.com/fgajardoe/PhylogenomicPipeline.git

Get the container

cd PhylogenomicPipeline 
singularity pull docker://fgajardoe/phylogenomic-analysis-container:latest

Note: Although the image is hosted on Docker Hub, the whole pipeline runs through Singularity. Adapting it to Docker should be straightforward: modify the Snakefile so that it runs Docker commands instead of Singularity commands.

Build a configfile

It's a Snakemake configfile, which is a YAML-formatted text file. Use the example provided here as a reference.

Remember to update the beginning of the Snakefile to match your configfile.

The configfile must associate wildcards with their corresponding FASTA files. You need one FASTA file per ortholog group, and each file must contain the sequence of that ortholog from every species considered in your analysis. One way to obtain them is to run BUSCO and extract from its results all the orthologs shared by your panel of species.
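
If you have many ortholog groups, writing the mapping by hand gets tedious. The sketch below shows one way to list every FASTA file in a directory as a YAML mapping and to count the sequences in each file. The top-level key `orthologs`, the `.fasta` extension and the function names are hypothetical; take the real key names and layout from the example configfile.

```shell
# Emit a YAML mapping with one entry per FASTA file in a directory.
# NOTE: the key "orthologs" is a placeholder, not the pipeline's real key.
make_config() {
    local fasta_dir="$1"
    local f name
    echo "orthologs:"
    for f in "$fasta_dir"/*.fasta; do
        [ -e "$f" ] || continue          # skip if the glob matched nothing
        name="$(basename "$f" .fasta)"   # file name without extension
        echo "  ${name}: ${f}"
    done
}

# Count sequences in a FASTA file: every record starts with a ">" header.
count_sequences() {
    grep -c '^>' "$1" || true
}
```

For example, `make_config fasta > my_config.yaml` writes the mapping for a hypothetical `fasta` directory, and `count_sequences fasta/OG0001.fasta` should print the number of species in your panel if that group is complete.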

Run the pipeline

conda activate snakemake
snakemake -p -j1

Good luck!
