Diamond+CAZY analysis for mRNA reads #103
Comments
You can use the 2022 EC file. As for what comes next, that depends on your research aims. People very often take the DIAMOND output and calculate an abundance value (e.g., FPKM) for each CAZyme family. Each CAZyme family can be linked to EC numbers and substrates, which lets you answer questions such as which families are more highly expressed than others under which conditions or in which samples, or which substrate degradation is more active. I've never used MEGAN, but I know it can compute abundance information. Other choices include HUMAnN 3, but I don't know whether it has a CAZyme database integrated into its pipeline. You can also look into https://github.com/AnantharamanLab/METABOLIC, which has dbCAN integrated. Yanbin |
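For anyone wanting to try the per-family abundance step by hand, here is a minimal Python sketch. It is not part of run_dbcan or MEGAN. It assumes DIAMOND's default 12-column tabular output (`-f 6`), and it assumes that subject IDs in the CAZyDB FASTA look like `ACCESSION|GH5|CBM2`, with family names after the first `|`; please check your own headers. It reports counts per million reads, which is a simpler stand-in for FPKM because it ignores gene length. The function names and the example rows are made up for illustration.

```python
import csv
from collections import Counter


def family_counts(diamond_tsv_lines, max_evalue=1e-3):
    """Count reads per CAZyme family, using each read's best hit only."""
    best = {}  # read ID -> (bitscore, subject ID) of the best hit seen so far
    for row in csv.reader(diamond_tsv_lines, delimiter="\t"):
        if len(row) < 12:
            continue  # skip malformed or truncated lines
        qseqid, sseqid = row[0], row[1]
        evalue, bitscore = float(row[10]), float(row[11])
        if evalue > max_evalue:
            continue
        if qseqid not in best or bitscore > best[qseqid][0]:
            best[qseqid] = (bitscore, sseqid)

    counts = Counter()
    for _, sseqid in best.values():
        # Assumption: fields after the accession are family names (GH5, CBM2, ...).
        # Fields that do not start with a letter (e.g. EC numbers) are ignored.
        for field in sseqid.split("|")[1:]:
            if field[:1].isalpha():
                counts[field] += 1
    return counts


def counts_per_million(counts, total_reads):
    """Scale raw family counts by sequencing depth."""
    return {fam: n * 1e6 / total_reads for fam, n in counts.items()}


# Made-up example rows in DIAMOND -f 6 column order.
example = [
    "read1\tAAA1.1|GH5\t90.0\t50\t5\t0\t1\t150\t10\t59\t1e-20\t80.0",
    "read1\tBBB2.1|GH9\t80.0\t50\t10\t0\t1\t150\t10\t59\t1e-10\t60.0",
    "read2\tCCC3.1|GH5|CBM2\t95.0\t50\t2\t0\t1\t150\t10\t59\t1e-25\t90.0",
    "read3\tDDD4.1|GT2\t70.0\t50\t15\t0\t1\t150\t10\t59\t0.5\t25.0",
]
counts = family_counts(example)
cpm = counts_per_million(counts, total_reads=1_000_000)
```

To run it on a real file, pass an open file handle (for example `open("R33P5_R_cazyme.out")`) instead of the `example` list, and set `total_reads` to the number of reads in the sample.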
Hi Yanbin, Thank you so much for the suggestions given. I'll look into METABOLIC as well. |
Hi @yinlabniu, Sorry, I have another question. As I mentioned before, I ran a blastx search of my nucleotide sequences against the CAZy database, and below is a snapshot of what the output looks like (snapshot not shown). However, when I analyze my .fasta (nucleotide) file with the run_dbcan tool, I don't seem to get any output for DIAMOND; I only get output for eCAMI. I have tried running with both -meta and -prok. Any idea why?
Thank you |
Hi @sumitra20. If the output has no DIAMOND column, it means that none of your input sequences matched the databases that DIAMOND searches. Your command looks correct. You can try this command on the example; it reports that the program is currently running only the DIAMOND and eCAMI tools.
|
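Hey @yinlabniu, Yes, running the test file EscheriaColiK12MG1655.fna works. But I find it strange that I get DIAMOND annotation output when I run
diamond blastx -q /R33P5_R_sortme_non_rRNA.fq -o ./R33P5_R_cazyme.out -b 20.0 -f 6 -e 1e-3
but no matched annotations when I use the exact same input file with run_dbcan. Is it because of the difference in the database used?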
|
It could be the e-value. You used 1e-3, but within run_dbcan the default e-value is set to 1e-102. Since you are working with short reads, the default is certainly too stringent.
Yanbin
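To see how much the cutoff matters, the following Python sketch counts how many hits in a DIAMOND tabular (`-f 6`) file survive at 1e-3 versus 1e-102. It only reads the e-value column (column 11 in the default layout) and does not call run_dbcan; the function name and the example rows are made up for illustration.

```python
def hits_passing(diamond_tsv_lines, max_evalue):
    """Return the number of tabular hits whose e-value is <= max_evalue."""
    n = 0
    for line in diamond_tsv_lines:
        fields = line.rstrip("\n").split("\t")
        # Column 11 (index 10) holds the e-value in the default -f 6 layout.
        if len(fields) >= 11 and float(fields[10]) <= max_evalue:
            n += 1
    return n


# Made-up rows: two short-read hits and one full-length gene hit.
hits = [
    "read1\tAAA1.1|GH5\t90.0\t50\t5\t0\t1\t150\t10\t59\t1e-20\t80.0",
    "read2\tCCC3.1|GH5|CBM2\t95.0\t50\t2\t0\t1\t150\t10\t59\t2e-8\t55.0",
    "gene1\tDDD4.1|GT2\t99.0\t400\t4\t0\t1\t1200\t1\t400\t1e-150\t600.0",
]
relaxed = hits_passing(hits, 1e-3)
strict = hits_passing(hits, 1e-102)
```

In this toy input all three hits pass 1e-3, but only the full-length gene hit passes 1e-102, which matches the pattern described above for short reads.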
________________________________
From: sumitra20
Sent: Tuesday, September 27, 2022 7:34 PM
To: linnabrown/run_dbcan
Cc: Yanbin Yin; Mention
Subject: Re: [linnabrown/run_dbcan] Diamond+CAZY analysis for mRNA reads (Issue #103)
Hey @yinlabniu,
Yes, running the test file EscheriaColiK12MG1655.fna works. But I find it strange that I get DIAMOND annotation output when I run
diamond blastx -q /R33P5_R_sortme_non_rRNA.fq -o ./R33P5_R_cazyme.out -b 20.0 -f 6 -e 1e-3
but no matched annotations when I use the exact same input file with run_dbcan. Is it because of the difference in the database used?
|
Dear Developers,
I am trying to do a CAZyme analysis using my mRNA reads. I downloaded a copy of the database (CAZyDB.08062022.fa), indexed it in DIAMOND format, and then ran a DIAMOND search. I've also downloaded the mapping file (CAZyDB.08062022.fam.subfam.ec.txt), but now I'm really confused about how to proceed with the annotation and how to get an annotation summary from the DIAMOND output. Can I do this with the MEGAN6 tool? I'd really appreciate it if you could give me some ideas on how to proceed. I'm very new to bioinformatics analysis and I'm struggling to understand the analysis pipelines.
Thank you