The script is currently a blunt implement but there is more in there.
Download the git repo, and change your directory to its root.
git clone https://github.com/rmFlynn/rule_adjectives.git
cd rule_adjectives
Install the conda environment:
conda env create -f environment.yaml
conda activate rule_adjectives # Don't forget this step
Install the packages:
pip3 install -e ./ # yes you need the -e for now
This is a temporary method of installation while the program is in beta, but a better method will be created after that.
- You must annotate with DRAM.1.4 or newer using the FeGenie database, for iron annotations to work correctly. If you don't iclued this database Iron annotations will not be made.
- Also! You must annotate with DRAM.1.4 or newer using the Sulfer database, for sulfer annotations to work correctly. If you don't iclued this database sulfer annotations will not be made.
- If you are in Wrighton lab using zenith you can access annotations in the scripts' environment, contact Rory Flynn if you have trouble accessing the scripts' environment.
- If you are in the wrighton lab you can use the dram2 beta to add these genes. See the air tabe for instructions on how to use this script.
- Tree based improvements are in the works.
Currently you can annotate genomes and print out cause plots. The options are probably best explained by example.
First use --help
to see all options. That will provide context.
rule_adjectives --help
Usage: rule_adjectives ANNOTATIONS_TSV OUT_TSV [OPTIONS]
Using a DRAM annotations file make a table of adjectives.
annotations_tsv: Path to a DRAM annotations file.
out_tsv: Path for the output adjectives true / false table.
Options:
-a, --adjectives TEXT #Takes a string of text which equal specific adjectives in the out_tsv adjectives true / false table to report.
-p, --plot_adjectives TEXT #Takes a string of text which equal specific adjectives in the out_tsv adjectives true / false table to plot.
-g, --plot_genomes TEXT #Takes a string of text which equal the name of a specific (or subset) of genome(s) in the out_tsv adjectives true / false table to plot.
--plot_path PATH #The path that will become a folder of output plots. No path - no plots.
--rules_tsv PATH #A path of alternative rules that will overwrite the defaults when calling adjectives.
--help #Show this message and exit.
Here is an example that will make a table for all adjectives together:
rule_adjectives ./annotations.tsv ./adjectives.tsv
Here is an example that will make a table for all adjectives as well as plots for each genome of for each adjective:
rule_adjectives ./annotations.tsv ./adjectives.tsv --plot_path ./path_for_plots
Here is how you would evaluate just 3 adjective:
rule_adjectives ./annotations.tsv ./adjectives.tsv \
-a "Autotroph" \
-a "Heterotroph" \
-a "sulfatereducer"
Here is how you would evaluate just 3 adjectives and plot 2 of them for 2 genomes:
rule_adjectives ./annotations.tsv ./adjectives.tsv \
-a "Autotroph" \
-a "Heterotroph" \
-a "sulfatereducer" \
--plot_path 'plots_dec1_3' \
-p "Autotroph" \
-p "sulfatereducer" \
-g bin1.scaffold1 \
-g bin1.scaffold2
DRAM can also help you extract the gene sequences for the adjectives of your choice. This can help for things like making phylogenetic trees of Nar/Nxr/Nap genes.
First, run DRAM Adjectives using the --strainer_tsv argument.
rule_adjectives ./annotations.tsv ./adjectives.tsv --strainer_tsv strainer_output.tsv
Then, using DRAM strainer, filter out the genes for each of the positive hits. Note: You can filter out the table for any specific genes you want to pull out before you run these next commands.
DRAM.py strainer -f genes_strainer_out.fna -i strainer_output.tsv
If you want to update the rules with out udateting the adjectives you can download them with this comand. THIS IS FOR ADVAINCED USERS ONLY. Using the wrong rules will probably break your results.
wget https://raw.githubusercontent.com/rmFlynn/rule_adjectives/main/dram2/rule_adjectives/rules.tsv