Troubleshooting GTDB-Tk Database Installation and Environment Configuration #596

Open
3 tasks
soojunglee98 opened this issue Jul 16, 2024 · 8 comments
Labels
error Help required for a GTDB-Tk error.

Comments

@soojunglee98

Environment

  • Installed via pip (include the output of pip list)
  • [0] Using a conda environment (include the output of conda list && conda list --revisions)
  • Using a Docker container (include the IMAGE ID of the container)

Debugging information

  • [0 ] gtdbtk.log has been included (drag and drop the file to upload).
  • Genomes have been included (if possible, and there are few).

Additional comments

As mentioned on this website, I installed the conda environment and tried to download the database.

When I run download.sh, it says:

"Cannot write to '/home/spotgiet/miniconda3/envs/gtdbtk-2.1.1/share/gtdbtk-2.1.1/db/gtdbtk_r207_v2_data.tar.gz' (No space left on device)."

So, I manually downloaded the database to another directory due to space limitations in my home directory. However, I encountered this error:


================================================================================
                                     ERROR                                      
________________________________________________________________________________

          The 'GTDBTK_DATA_PATH' environment variable is not defined.           

            Please set this variable to your reference data package.            
               https://github.com/Ecogenomics/GTDBTk#installation               
================================================================================

================================================================================
                                     ERROR                                      
________________________________________________________________________________

           The GTDB-Tk reference data does not exist or is corrupted.           
                GTDBTK_DATA_PATH=/path/to/unarchived/gtdbtk/data                

   Please compare the checksum to those provided in the download repository.    
          https://github.com/Ecogenomics/GTDBTk#gtdb-tk-reference-data          
================================================================================
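Before placing the reference data in another location, as described above, it can help to confirm that the target filesystem has room. The following is a minimal sketch, not part of GTDB-Tk: `has_space` is a hypothetical helper, and the 1 KB threshold is only a placeholder, since the real requirement depends on the release being downloaded.

```shell
# has_space DIR KB: succeed if the filesystem holding DIR has at least KB
# kilobytes available. Hypothetical helper for illustration only.
has_space() {
    # -P forces one line per filesystem; column 4 is the available space in KB.
    avail_kb=$(df -Pk "$1" | awk 'NR==2 {print $4}')
    [ "$avail_kb" -ge "$2" ]
}

# Placeholder check: replace the directory and the size with your own values.
target_dir="${HOME:-/}"
if has_space "$target_dir" 1; then
    echo "enough space in $target_dir"
else
    echo "not enough space in $target_dir"
fi
```

Running the same check against the directory that produced "No space left on device" and against the intended new location shows which one can hold the archive and its extracted contents.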

So again, as mentioned on the website, I activated my conda environment and tried to run:

conda env config vars set GTDBTK_DATA_PATH="/scratch/raskin_root/raskin0/shared_data/Soojung_Sarah/gtdb_tk/release220"

But it keeps saying: "To make your changes take effect, please reactivate your environment." even though my conda environment is already activated. Any suggestions? Thank you so much!

@soojunglee98 soojunglee98 added the error Help required for a GTDB-Tk error. label Jul 16, 2024
@pchaumeil
Collaborator

You need to deactivate and then reactivate the environment for the changes to be applied properly.

conda deactivate
conda activate your_gtdbtk_environment_name

This step is necessary because variables set with conda env config vars set only take effect when the environment is activated.

@soojunglee98
Author

soojunglee98 commented Jul 23, 2024 via email

@pchaumeil
Collaborator

Do you still have the exact same error, or is your error now pointing to the new release220 folder?
That is, is the error similar to

The GTDB-Tk reference data does not exist or is corrupted.           
GTDBTK_DATA_PATH=/path/to/unarchived/gtdbtk/data    

or

The GTDB-Tk reference data does not exist or is corrupted.           
GTDBTK_DATA_PATH=/scratch/raskin_root/raskin0/shared_data/Soojung_Sarah/gtdb_tk/release220
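One way to tell these cases apart is to inspect the variable directly in the shell where GTDB-Tk is run. This is a rough sketch only: `check_gtdbtk_data_path` is a hypothetical helper, not a GTDB-Tk command, and it checks only that the directory exists and is non-empty, not that the reference data is complete or uncorrupted.

```shell
# Hypothetical helper: report the state of GTDBTK_DATA_PATH in this shell.
check_gtdbtk_data_path() {
    if [ -z "${GTDBTK_DATA_PATH:-}" ]; then
        echo "unset"      # variable not defined (first error banner)
    elif [ ! -d "$GTDBTK_DATA_PATH" ]; then
        echo "missing"    # e.g. still the /path/to/unarchived/gtdbtk/data placeholder
    elif [ -z "$(ls -A "$GTDBTK_DATA_PATH" 2>/dev/null)" ]; then
        echo "empty"      # directory exists but holds no files
    else
        echo "present"    # directory exists and has content
    fi
}

echo "GTDBTK_DATA_PATH=${GTDBTK_DATA_PATH:-}"
check_gtdbtk_data_path
```

If this prints "missing" together with the placeholder path, the new value has not reached the shell yet. If it prints "present" with the release220 path and GTDB-Tk still fails, the checksum comparison suggested in the error message is the next thing to try.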

@pb-max

pb-max commented Oct 26, 2024

I have the same problem with release 202. I did not have it with release 220.

@pchaumeil
Collaborator

Hi,
Apologies for the delay.
Do you still have this problem?

@soojunglee98
Author

soojunglee98 commented Dec 17, 2024 via email

@pb-max

pb-max commented Dec 17, 2024 via email

@pchaumeil
Collaborator

Does it work if you export GTDBTK_DATA_PATH in the activated conda environment?

export GTDBTK_DATA_PATH=/scratch/raskin_root/raskin0/shared_data/Soojung_Sarah/gtdb_tk/release220
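The `export` keyword matters here because a plain assignment stays in the current shell, while an exported variable is also passed to programs started from it. The sketch below illustrates the difference with two hypothetical variable names (`DEMO_PLAIN` and `DEMO_EXPORTED`) rather than the real one, so it can be run without touching an existing setup.

```shell
# Plain assignment: visible in this shell only.
DEMO_PLAIN="value"
# export: also inherited by child processes, which is what a program
# launched from this shell would see.
export DEMO_EXPORTED="value"

# Ask a child shell what it can see.
plain_seen=$(sh -c 'printf "%s" "${DEMO_PLAIN:-}"')
exported_seen=$(sh -c 'printf "%s" "${DEMO_EXPORTED:-}"')

echo "plain in child: '${plain_seen}'"
echo "exported in child: '${exported_seen}'"
```

Note that an export made this way lasts only for the current shell session, so it has to be repeated in each new terminal or job script unless it is also stored in the environment configuration.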
