Run IGM on Curnagl with SLURM #19

Open
pjacquet1977 opened this issue Jun 12, 2024 · 2 comments

Comments

@pjacquet1977
Collaborator

pjacquet1977 commented Jun 12, 2024

On Curnagl you can run IGM by sending a job via SLURM.

For this you need to specify the directory where your input data are located:

PATH='/scratch/pjacquet/tests/examples/OptIGM/'

Then, you can run IGM as follows:

/work/CTR/CI/DCSR/rfabbret/default/pjacquet/igm_venv_gpu/bin/igm_run --param_file=$PATH"params.json" --lncd_input_file=$PATH"input.nc"

The problem is that IGM will also look for the file "make_steps.py" in the folder "modules_custom".

If this folder is located in the same directory as your input data, the run fails because IGM does not find it. Note that it does work if we do NOT use SLURM but simply run IGM interactively from '/scratch/pjacquet/tests/examples/OptIGM/'.
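One possible workaround, offered as an untested sketch rather than a confirmed fix: since the interactive run works when launched from the data directory, the batch job could change into that directory before calling igm_run. The names DATA_DIR and run_from_dir are illustrative and not part of IGM. A name other than PATH is used for the directory because PATH is the shell's command search path, and overwriting it can stop the job from finding other programs.

```shell
#!/bin/bash
# Hypothetical helper (not part of IGM): run a command from inside a given
# directory, so that relative lookups such as "modules_custom" resolve there.
run_from_dir() {
    local dir="$1"
    shift
    # The subshell keeps the caller's working directory unchanged.
    ( cd "$dir" && "$@" )
}

# Example use inside a SLURM batch script (paths taken from the report above):
# DATA_DIR='/scratch/pjacquet/tests/examples/OptIGM/'
# run_from_dir "$DATA_DIR" \
#     /work/CTR/CI/DCSR/rfabbret/default/pjacquet/igm_venv_gpu/bin/igm_run \
#     --param_file="${DATA_DIR}params.json" \
#     --lncd_input_file="${DATA_DIR}input.nc"
```

SLURM's `sbatch --chdir=<directory>` option also sets the job's working directory and may achieve the same effect without the helper.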

The error message states that we can put a copy of the file "make_steps.py" in the directory "igm/igm/modules/process/". I've tried this, but it does not work with SLURM. Furthermore, it is not a clean solution, since one can no longer do a "git pull" afterward.

Should we set PYTHONPATH to include the working directory, or specify the folder modules_custom as an option of IGM?
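The first option might look like the sketch below. This is an assumption, not a tested fix: it only helps if IGM loads custom modules through Python's normal import path, which nothing in this thread confirms. with_pythonpath and DATA_DIR are hypothetical names, not part of IGM.

```shell
#!/bin/bash
# Hypothetical helper: prepend a directory to PYTHONPATH for one command only.
with_pythonpath() {
    local dir="$1"
    shift
    # ${PYTHONPATH:+:$PYTHONPATH} appends the previous value only when it is
    # set and non-empty, which avoids leaving a stray ":" at the end.
    PYTHONPATH="${dir}${PYTHONPATH:+:$PYTHONPATH}" "$@"
}

# Example use (assumed, untested with IGM):
# DATA_DIR='/scratch/pjacquet/tests/examples/OptIGM/'
# with_pythonpath "$DATA_DIR" \
#     /work/CTR/CI/DCSR/rfabbret/default/pjacquet/igm_venv_gpu/bin/igm_run \
#     --param_file="${DATA_DIR}params.json" \
#     --lncd_input_file="${DATA_DIR}input.nc"
```

The assignment applies only to the launched command, so the PYTHONPATH of the surrounding script is left as it was.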

@brfi3983
Collaborator

Hi @pjacquet1977,

I believe I know what is going on: I had to manually change the working directory when instantiating the package, which, to the best of my knowledge, is not good practice.

I will confirm that this is the issue and test it on SLURM and locally before getting back to you.

@brfi3983
Collaborator

brfi3983 commented Sep 4, 2024

This problem should be fixed with the upcoming update, where we use hydra to manage working directories instead of the os package. We will test running it on SLURM with a single GPU as well as with multiple GPUs (separate jobs). Keep an eye out for that; once we have confirmed it is working, I will close this issue.
