Job Retry Project: Modifiers and MemoryModifier #11928

Open
wants to merge 20 commits into
base: master

@LinaresToine LinaresToine commented Mar 12, 2024

Fixes #11881

Status

In development

Description

This PR introduces Modifiers, a new feature of the RetryManager. The setup is analogous to that of the Plugins: after a plugin has been selected with selectRetryAlgo, the RetryManagerPoller calls selectJobModifier, which modifies the jobs that failed with specific exit codes.
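
To illustrate the exit-code dispatch described above, here is a minimal sketch. It is not the code in this PR: the BaseModifier class, the registry dictionary, the job dictionaries with an "exitCode" key, and the select_job_modifier function are all hypothetical names chosen for the example. Only the idea of mapping an exit code to a modifier name comes from the description.

```python
class BaseModifier:
    """Hypothetical base class; not WMCore's actual interface."""

    def __init__(self, settings=None):
        self.settings = settings or {}

    def modify(self, job):
        raise NotImplementedError


class MemoryModifier(BaseModifier):
    """Stand-in modifier that only tags the job it was applied to."""

    def modify(self, job):
        modified = dict(job)  # copy so the caller's job is left untouched
        modified["modifiedBy"] = "MemoryModifier"
        return modified


# Hypothetical registry from modifier name to class.
MODIFIER_REGISTRY = {"MemoryModifier": MemoryModifier}


def select_job_modifier(jobs, modifiers_by_exit_code, registry=MODIFIER_REGISTRY):
    """Apply the configured modifier to each job whose exit code matches."""
    result = []
    for job in jobs:
        name = modifiers_by_exit_code.get(job.get("exitCode"))
        if name is None:
            # No modifier configured for this exit code: retry the job as is.
            result.append(job)
            continue
        result.append(registry[name]().modify(job))
    return result


jobs = [{"id": 1, "exitCode": 50660}, {"id": 2, "exitCode": 8001}]
retried = select_job_modifier(jobs, {50660: "MemoryModifier"})
```

Jobs whose exit code has no configured modifier pass through unchanged, which matches the backward-compatibility argument made further down: the retry list only goes through one extra step.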

As for the Modifiers themselves, this PR introduces only the MemoryModifier, whose functions modify the job pkl file and the sandbox maxPSS parameter. To use it, modify the config file along these lines:

config.RetryManager.modifiers={50660: 'MemoryModifier'}
config.RetryManager.section_('MemoryModifier')
config.RetryManager.MemoryModifier.section_()
config.RetryManager.MemoryModifier..settings = {'requiresModify': True, 'multiplyMemoryPerCore': 200, 'maxMemoryPerCore': 2000}
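
As a rough illustration of what updating a job pkl file could look like, the sketch below loads a pickled dictionary, raises a memory value, caps it, and writes the file back. Everything here is an assumption made for the example: the scaling rule (multiply by a factor, then cap at a maximum) is not necessarily how this PR interprets multiplyMemoryPerCore and maxMemoryPerCore, and the "estimatedMemoryUsage" key, the function names, and the dictionary layout of the pickle are hypothetical.

```python
import pickle
import tempfile
from pathlib import Path


def bump_memory(current_mb, factor, max_mb):
    """Scale a memory value by factor, never exceeding max_mb (assumed rule)."""
    return min(current_mb * factor, max_mb)


def update_job_pickle(path, factor, max_mb, key="estimatedMemoryUsage"):
    """Rewrite the pickled job at path with an increased memory value.

    The key name and the dict layout are assumptions for this sketch.
    """
    path = Path(path)
    with path.open("rb") as handle:
        job = pickle.load(handle)
    job[key] = bump_memory(job[key], factor, max_mb)
    with path.open("wb") as handle:
        pickle.dump(job, handle)
    return job[key]


# Small demonstration on a temporary file standing in for a job pkl.
with tempfile.TemporaryDirectory() as tmp:
    pkl = Path(tmp) / "job.pkl"
    pkl.write_bytes(pickle.dumps({"estimatedMemoryUsage": 1000}))
    first = update_job_pickle(pkl, factor=1.5, max_mb=2000)   # 1000 * 1.5 = 1500
    second = update_job_pickle(pkl, factor=1.5, max_mb=2000)  # capped at 2000
```

Capping the value means repeated retries cannot raise the request without bound, which is presumably the purpose of a maximum setting.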

Is it backward compatible (if not, which system it affects?)

Maybe. I believe it is, since the only change is that the list of jobs to retry goes through one extra step before the jobs are actually retried. This patch should work on previous versions without issues.

Related PRs

This PR formalizes the development that @germanfgv and I have been working on in LinaresToine#3

External dependencies / deployment changes

None

@cmsdmwmbot

Jenkins results:

  • Python3 Unit tests: failed
    • 8 new failures
    • 3 changes in unstable tests
  • Python3 Pylint check: failed
    • 16 warnings and errors that must be fixed
    • 2 warnings
    • 70 comments to review
  • Pylint py3k check: failed
    • 2 errors and warnings that should be fixed
  • Pycodestyle check: succeeded
    • 52 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/14963/artifact/artifacts/PullRequestReport.html

@cmsdmwmbot

Jenkins results:

  • Python3 Unit tests: failed
    • 8 new failures
    • 2 changes in unstable tests
  • Python3 Pylint check: failed
    • 16 warnings and errors that must be fixed
    • 2 warnings
    • 70 comments to review
  • Pylint py3k check: failed
    • 2 errors and warnings that should be fixed
  • Pycodestyle check: succeeded
    • 52 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/14998/artifact/artifacts/PullRequestReport.html

@cmsdmwmbot

Jenkins results:

  • Python3 Unit tests: failed
    • 9 new failures
    • 1 tests no longer failing
    • 2 changes in unstable tests
  • Python3 Pylint check: failed
    • 16 warnings and errors that must be fixed
    • 2 warnings
    • 73 comments to review
  • Pylint py3k check: failed
    • 2 errors and warnings that should be fixed
  • Pycodestyle check: succeeded
    • 54 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/14999/artifact/artifacts/PullRequestReport.html

@cmsdmwmbot

Jenkins results:

  • Python3 Unit tests: failed
    • 8 new failures
    • 1 tests no longer failing
    • 3 changes in unstable tests
  • Python3 Pylint check: failed
    • 16 warnings and errors that must be fixed
    • 2 warnings
    • 73 comments to review
  • Pylint py3k check: failed
    • 2 errors and warnings that should be fixed
  • Pycodestyle check: succeeded
    • 54 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/15000/artifact/artifacts/PullRequestReport.html

@cmsdmwmbot

Jenkins results:

  • Python3 Unit tests: failed
    • 9 new failures
    • 2 changes in unstable tests
  • Python3 Pylint check: failed
    • 17 warnings and errors that must be fixed
    • 2 warnings
    • 74 comments to review
  • Pylint py3k check: failed
    • 2 errors and warnings that should be fixed
  • Pycodestyle check: succeeded
    • 55 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/15001/artifact/artifacts/PullRequestReport.html

@cmsdmwmbot

Jenkins results:

  • Python3 Unit tests: failed
    • 8 new failures
    • 3 changes in unstable tests
  • Python3 Pylint check: failed
    • 18 warnings and errors that must be fixed
    • 2 warnings
    • 74 comments to review
  • Pylint py3k check: failed
    • 2 errors and warnings that should be fixed
  • Pycodestyle check: succeeded
    • 57 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/15028/artifact/artifacts/PullRequestReport.html

@cmsdmwmbot

Jenkins results:

  • Python3 Unit tests: failed
    • 8 new failures
    • 4 changes in unstable tests
  • Python3 Pylint check: failed
    • 29 warnings and errors that must be fixed
    • 5 warnings
    • 101 comments to review
  • Pylint py3k check: failed
    • 2 errors and warnings that should be fixed
  • Pycodestyle check: succeeded
    • 58 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/15032/artifact/artifacts/PullRequestReport.html

@cmsdmwmbot

Jenkins results:

  • Python3 Unit tests: failed
    • 9 new failures
    • 2 changes in unstable tests
  • Python3 Pylint check: failed
    • 29 warnings and errors that must be fixed
    • 5 warnings
    • 101 comments to review
  • Pylint py3k check: failed
    • 2 errors and warnings that should be fixed
  • Pycodestyle check: succeeded
    • 58 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/15098/artifact/artifacts/PullRequestReport.html

@cmsdmwmbot

Jenkins results:

  • Python3 Unit tests: failed
    • 9 new failures
    • 2 changes in unstable tests
  • Python3 Pylint check: failed
    • 30 warnings and errors that must be fixed
    • 5 warnings
    • 120 comments to review
  • Pylint py3k check: failed
    • 2 errors and warnings that should be fixed
  • Pycodestyle check: succeeded
    • 64 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/15099/artifact/artifacts/PullRequestReport.html

@cmsdmwmbot

Jenkins results:

  • Python3 Unit tests: failed
    • 8 new failures
    • 3 changes in unstable tests
  • Python3 Pylint check: failed
    • 34 warnings and errors that must be fixed
    • 5 warnings
    • 129 comments to review
  • Pylint py3k check: failed
    • 2 errors and warnings that should be fixed
  • Pycodestyle check: succeeded
    • 75 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/15100/artifact/artifacts/PullRequestReport.html

@cmsdmwmbot

Jenkins results:

  • Python3 Unit tests: failed
    • 10 new failures
    • 1 tests no longer failing
    • 3 changes in unstable tests
  • Python3 Pylint check: failed
    • 33 warnings and errors that must be fixed
    • 5 warnings
    • 129 comments to review
  • Pylint py3k check: failed
    • 2 errors and warnings that should be fixed
  • Pycodestyle check: succeeded
    • 75 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/15101/artifact/artifacts/PullRequestReport.html

@cmsdmwmbot

Jenkins results:

  • Python3 Unit tests: failed
    • 10 new failures
    • 3 changes in unstable tests
  • Python3 Pylint check: failed
    • 33 warnings and errors that must be fixed
    • 4 warnings
    • 109 comments to review
  • Pylint py3k check: failed
    • 1 errors and warnings that should be fixed
  • Pycodestyle check: succeeded
    • 79 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/15103/artifact/artifacts/PullRequestReport.html

@cmsdmwmbot

Jenkins results:

  • Python3 Unit tests: failed
    • 10 new failures
    • 1 tests no longer failing
    • 4 changes in unstable tests
  • Python3 Pylint check: failed
    • 35 warnings and errors that must be fixed
    • 4 warnings
    • 109 comments to review
  • Pylint py3k check: failed
    • 1 errors and warnings that should be fixed
  • Pycodestyle check: succeeded
    • 79 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/15102/artifact/artifacts/PullRequestReport.html

@cmsdmwmbot

Jenkins results:

  • Python3 Unit tests: failed
    • 15 new failures
    • 4 changes in unstable tests
  • Python3 Pylint check: failed
    • 35 warnings and errors that must be fixed
    • 5 warnings
    • 148 comments to review
  • Pylint py3k check: failed
    • 1 errors and warnings that should be fixed
  • Pycodestyle check: succeeded
    • 99 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/15104/artifact/artifacts/PullRequestReport.html

@cmsdmwmbot

Jenkins results:

  • Python3 Unit tests: failed
    • 6 new failures
    • 43 tests deleted
    • 572 tests no longer failing
    • 183 tests added
    • 16 changes in unstable tests
  • Python3 Pylint check: failed
    • 35 warnings and errors that must be fixed
    • 5 warnings
    • 148 comments to review
  • Pylint py3k check: failed
    • 1 errors and warnings that should be fixed
  • Pycodestyle check: succeeded
    • 99 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/15105/artifact/artifacts/PullRequestReport.html

@cmsdmwmbot

Can one of the admins verify this patch?


2 participants