How to calculate CAZyme TPM for metagenomic data? #184

Open
libby-natola opened this issue Jul 16, 2024 · 1 comment

@libby-natola

Hello dbcan devs!

Thanks for developing the dbCAN command line tools, and thanks for providing such helpful and detailed documentation and tutorials!

I'm using dbCAN to annotate CAZymes/CGCs/substrates within some eDNA metagenomic samples. I assembled the metagenomes on my own and annotated the prokaryotic contigs using run_dbcan like so:

run_dbcan $dir/contigs5000.proks.fasta meta -c cluster --dbcan_thread 24 --tf_cpu 24 --stp_cpu 24 --hmm_cpu 24 --dia_cpu 24 --cgc_substrate --out_dir $dbcan_dir --db_dir /mnt/Genomics/Working/databases/dbCAN/db

My ultimate goal is to have the normalized abundances for CAZymes, CGCs, and substrates in TPM. I'm trying to follow the steps in Module 3 of the metagenomic example in the user guide, hoping to end up at P13, which requires the depth.txt file. That file is generated from the CDS.bam/CDS.sam files, which are in turn generated from the .ffn file. However, I don't have the .ffn files specified in the read mapping step (P8). I suspect this is because I ran dbcan with the 'meta' tool and didn't end up using prokka, so I can't follow the instructions as written. I understand that for TPM I need the depth and length of each gene and the read depth of each sample, but I'm struggling to calculate the CAZyme gene depths and lengths without the .ffn files.
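For reference, the TPM arithmetic itself can be sketched in a few lines of Python. This is a minimal sketch of the standard TPM definition, not the dbCAN protocol's own script: the function name `tpm` and the gene names and numbers are hypothetical, and it assumes per-gene mapped read counts and gene lengths (in bp) are already available for one sample.

```python
def tpm(counts, lengths):
    """Return TPM per gene for one sample.

    counts:  dict of gene ID -> mapped read count (assumed input)
    lengths: dict of gene ID -> gene length in bp (assumed input)
    """
    # Step 1: reads per kilobase of gene length.
    rpk = {gene: counts[gene] / (lengths[gene] / 1000.0) for gene in counts}
    # Step 2: scale so that all values in the sample sum to one million.
    total = sum(rpk.values())
    if total == 0:
        return {gene: 0.0 for gene in rpk}
    return {gene: value / total * 1e6 for gene, value in rpk.items()}


# Hypothetical toy data, for illustration only.
counts = {"gene_a": 100, "gene_b": 300, "gene_c": 0}
lengths = {"gene_a": 1000, "gene_b": 3000, "gene_c": 500}
result = tpm(counts, lengths)
```

Because the scaling in step 2 runs over every gene in the sample, TPM is normally computed across all predicted genes first and only then subset to the CAZyme genes. Normalizing over CAZymes alone would give values that are relative to the CAZyme pool rather than to the whole sample.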

How do you suggest I go about calculating the TPM in this situation? Is there an alternate way to generate the .ffn file I need? Or perhaps I could manipulate some other output file to get the required data?
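One possible workaround, sketched here under assumptions rather than taken from the dbCAN documentation, is to cut the gene nucleotide sequences out of the contigs using gene coordinates in GFF format. The sketch assumes a GFF with `CDS` rows is available from the gene-prediction step; the helper names `read_fasta`, `reverse_complement`, and `cds_from_gff` and the toy sequences are hypothetical.

```python
def read_fasta(lines):
    """Parse FASTA lines into a dict of sequence name -> sequence."""
    seqs, name, chunks = {}, None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                seqs[name] = "".join(chunks)
            name, chunks = line[1:].split()[0], []
        else:
            chunks.append(line)
    if name is not None:
        seqs[name] = "".join(chunks)
    return seqs


def reverse_complement(seq):
    """Reverse-complement a DNA string (A, C, G, T, N only)."""
    return seq.translate(str.maketrans("ACGTNacgtn", "TGCANtgcan"))[::-1]


def cds_from_gff(contigs, gff_lines):
    """Yield (gene_id, nucleotide_sequence) for each CDS row of a GFF."""
    for line in gff_lines:
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != "CDS":
            continue
        contig, start, end, strand = cols[0], int(cols[3]), int(cols[4]), cols[6]
        attrs = dict(kv.split("=", 1) for kv in cols[8].split(";") if "=" in kv)
        # Fall back to a coordinate-based ID if the row has no ID attribute.
        gene_id = attrs.get("ID", f"{contig}_{start}_{end}")
        seq = contigs[contig][start - 1:end]  # GFF is 1-based, inclusive
        if strand == "-":
            seq = reverse_complement(seq)
        yield gene_id, seq


# Hypothetical toy input: one contig with a forward and a reverse gene.
contigs = read_fasta([">c1 toy contig", "ATGAAATAG", "TTAGGGCAT"])
gff = [
    "##gff-version 3",
    "c1\ttoy\tCDS\t1\t9\t.\t+\t0\tID=g1",
    "c1\ttoy\tCDS\t10\t18\t.\t-\t0\tID=g2",
]
genes = dict(cds_from_gff(contigs, gff))
gene_lengths = {gene_id: len(seq) for gene_id, seq in genes.items()}
```

Writing each pair out as `>gene_id` followed by the sequence would give a FASTA file that could stand in for the .ffn in a read-mapping step, and `gene_lengths` supplies the lengths needed for TPM. The IDs in the GFF may not match the gene IDs in the run_dbcan output exactly, so they may need to be reconciled before joining the two.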

Thanks very much for any guidance you can provide!

@Xinpeng021001
Collaborator

Hi, we're updating our protocol steps and will upload a new version of that part in a few days.
