-
Notifications
You must be signed in to change notification settings - Fork 230
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Custom test grouping and test group order logic #500
base: master
Are you sure you want to change the base?
Conversation
…ordering test groups for execution
453908c
to
c7f5779
Compare
Hi @SalmonMode, Thanks a lot for the PR! As you mention this is a long-requested feature so all the work you have put into it is more than welcome! I like the approach of providing "low level" hooks that lets users and plugins customize test grouping to be sent to workers. Also so far in the review I'm very satisfied with the code quality and documentation you've put into it. 👍 I'm writing this just so to let you know that it will probably take awhile to review a PR of this magnitude, so please be patient! 😁 |
@nicoddemus haha I appreciate it 😁 . Take your time. I spent a very long time on it (granted, at least half of that time was on what to name the hooks). As a side note, the name I chose was because I figure at some point in the future, this may be expanded to allow grouping based on markers, rather than just names, but I couldn't figure out how to pull those in in a good way, and wanted to get the PR in rather than holding onto it for longer. |
Any chance to refresh this one? |
Thanks for the ping @ssbarnea! Indeed I had lost sight of this. Will make time for completing my review this week. 👍 |
@nicoddemus just reminding about the existence of this PR :) |
Heya @nicoddemus, sorry for the bump but we would be extremely grateful for a PR of this kind to be added to pytest-xdist. Is it still feasable? Has some other functionality superseded it? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is quite large, so itll be a while before I can take a deeper look
I want to note that it's unfortunate we don't have a good way to transfer details about marks with the nodeids, we should add a hook that allows to inform the coordinator about affinities (session scope fixtures, certain configurations)
|
||
By default, there is no grouping logic and every individual test is placed in its own | ||
test group, so using the ``-n`` option will send pending tests to any worker that is | ||
available, without any guaranteed order. It should be assumed that when using this |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a bit misleading in the sense that it suggests every test will run in isolation with their own copies of every fixture (including high-scoped fixtures like session), which is not true: it is just that each worker is its own "session", so high-scoped fixtures will live in that session as if in an isolated pytest executing.
We could rewrite that part, but just removing it altogether is an option too. What do you think?
grouping options, based on simple criteria about a test's nodeid. so you can gunarantee | ||
that certain tests are run in the same process. When they're run in the same process, | ||
you gunarantee that larger-scoped fixtures are only executed as many times as would | ||
normally be expected for the tests in the test group. But, once that test group is |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The phrase "But, once that test group..." suggests that xdist might be do something special in the sense to destroy the fixtures...
Perhaps we should have a separate section explaining how fixture execution in general works in xdist: each worker is its own session, so high-scope fixtures are bound to that worker, etc. This section applies to xdist in general and is not specific to the test grouping feature.
Back to the docs at hand, we can then just discuss how tests are grouped/sent to workers, without getting into details again regarding fixture setup/teardown.
What do you think? Hope my comments make sense. 😁
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | ||
|
||
By default, ``pytest-xdist`` doesn't group any tests together, but it provides some | ||
grouping options, based on simple criteria about a test's nodeid. so you can gunarantee |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
grouping options, based on simple criteria about a test's nodeid. so you can gunarantee | |
grouping options, based on simple criteria about a test's nodeid, so you can guarantee |
groups, as it creates a new groups as needed. You can tap into this system to define | ||
your own grouping logic by using the ``pytest_xdist_set_test_group_from_nodeid``. | ||
|
||
If you define your own copy of that hook, it will be called once for every test, and the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you define your own copy of that hook, it will be called once for every test, and the | |
If you define your own implementation of that hook, it will be called once for every test, and the |
:: | ||
|
||
workqueue = { | ||
'<full>/<path>/<to>/test_module.py': { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
'<full>/<path>/<to>/test_module.py': { | |
'<group id 1>': { |
Perhaps use <group id 1>
to illustrate the idea that a group id is anything, not necessarily a filename?
should always be identical. This is an alias for | ||
`.registered_collections``. | ||
|
||
:node2pending: Map of nodes and the names of their pending test groups. The |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably best to move this to the properties' docstring?
|
||
@property | ||
def collection_is_completed(self): | ||
"""Boolean indication initial test collection is complete. | ||
"""Booleanq indication initial test collection is complete. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"""Booleanq indication initial test collection is complete. | |
"""Boolean indication initial test collection is complete. |
Just so we are in the same page, you mean in this PR or in a follow up? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed the implementation and it looks good, left a few more comments.
I personally would like to have some of the current code that manipulates workloads/test groups abstracted away into separate classes, instead of having all the details implemented into the scope itself, but definitely this can be done later if someone decides to tackle it.
def pytest_xdist_order_test_groups(workqueue): | ||
"""Sort the queue of test groups to determine the order they will be executed in. | ||
|
||
The ``workqueue`` is an ``OrderedDict`` containing all of the test groups in the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ``workqueue`` is an ``OrderedDict`` containing all of the test groups in the | |
The ``workqueue`` is an ``OrderedDict`` of ``group => list of node ids`` containing all of the test groups in the |
single test. This is designed to be extensible so that custom grouping | ||
logic can be applied either by making a child class from this and | ||
overriding the ``get_default_test_group`` method, or by defining the | ||
``pytest_xdist_set_test_group_from_nodeid``.hook. If the hook is used, but it returns |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
``pytest_xdist_set_test_group_from_nodeid``.hook. If the hook is used, but it returns | |
``pytest_xdist_set_test_group_from_nodeid`` hook. If the hook is used, but it returns |
:: | ||
|
||
assigned_work = { | ||
'<worker node A>': { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
'<worker node A>': { | |
'gw0': { |
Using ids as we have them internally will make this easier to understand I think.
:: | ||
|
||
registered_collections = { | ||
'<worker node A>': [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
'<worker node A>': [ | |
'gw0': [ |
@@ -55,3 +55,55 @@ def pytest_xdist_node_collection_finished(node, ids): | |||
@pytest.mark.firstresult | |||
def pytest_xdist_make_scheduler(config, log): | |||
""" return a node scheduler implementation """ | |||
|
|||
|
|||
@pytest.mark.trylast |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pytest.mark.trylast | |
@pytest.hookspec(trylast=True, firstresult=True) |
Besides using the new syntax, I think we should add firstresult
here to make it clear in the API that we will only use the first result from the hook.
|
||
|
||
@pytest.mark.trylast | ||
def pytest_xdist_set_test_group_from_nodeid(nodeid): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def pytest_xdist_set_test_group_from_nodeid(nodeid): | |
def pytest_xdist_get_test_group_from_nodeid(nodeid): |
Perhaps "get" better conveys that we should return the test group? To me "set" conveys that I should set the test group somewhere.
|
||
|
||
class LoadScopeScheduling(object): | ||
class LoadScopeScheduling(LoadScheduling): | ||
"""Implement load scheduling across nodes, but grouping test by scope. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So loadfile
and loadscope
are distribute the tests identically now? Or am I missing something?
Closes #18
It's been long requested (#18) to allow for custom logic for the distribution of tests among test groups in a simple and clean way. This adds that functionality by providing a hook that each tests nodeid is passed to, and whatever is returned by the hook for a given test is that test's test group name. All tests with the same test group name will be batched together and passed to the same worker together.
I also added another hook that grants access to the finished collection of test groups so that the order of the groups can be changed as needed. This allows for certain optimizations to be made, such as putting slow tests in the same group and placing that group in the beginning of the collection in order to speed the test suite up.
I also took the opportunity to clean up the logic quite a bit as there was a lot of repetition between the different load schedulers. Now they are unified into the same base class (except for
EachScheduling
) with the only differing logic being what their default grouping logic is. I felt I should leaveEachScheduling
alone for now, but I imagine it could also use the same base class by simply having every test in the same group, and then duplicating that group N times. But that seemed kinda memory heavy.Thanks for submitting a PR, your contribution is really appreciated!
Here's a quick checklist that should be present in PRs:
[x ] Make sure to include reasonable tests for your change if necessary
[x ] We use towncrier for changelog management, so please add a news file into the
changelog
folder following these guidelines:Name it
$issue_id.$type
for example588.bugfix
;If you don't have an issue_id change it to the PR id after creating it
Ensure type is one of
removal
,feature
,bugfix
,vendor
,doc
ortrivial
Make sure to use full sentences with correct case and punctuation, for example: