Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

disable populating log_context based on commit_sha/commit_id #977

Merged
merged 1 commit into from
Dec 20, 2024

Conversation

matt-codecov
Copy link
Contributor

@matt-codecov matt-codecov commented Dec 20, 2024

apparently populating the log context was slamming the commit table. so this PR stops using the commit table when populating log context

previously:

  • if commit_id or commit_sha+repo_id were available, query the commits table to populate every field
  • everything is populated this way, including commit_id (the DB PK for the commit)

now:

  • commit_id and commit_sha will be set if available but not used to query anything else
  • if repo_id is available, that will take care of populating the repo + owner

it's possible this will just move the problem to the repos table. if that is the case, it's solvable but will take a little elbow grease:

  • transform populate_from_sqlalchemy() in helpers/log_context.py into populate_log_context(kwargs) -> (Commit, Repository, Owner) in tasks/base.py
  • add a run_impl() stub to BaseCodecovTask which takes owner, commit, and repo kwargs, make all task subclasses accept the same (with default None) values
  • in several followup PRs, update task subclasses to use the passed-in models instead of re-fetching them

follow-up issue for model reuse: codecov/engineering-team#3140

@matt-codecov matt-codecov requested review from trent-codecov and a team December 20, 2024 16:42
@codecov-staging
Copy link

codecov-staging bot commented Dec 20, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

✅ All tests successful. No failed tests found.

📢 Thoughts on this report? Let us know!

@codecov-qa
Copy link

codecov-qa bot commented Dec 20, 2024

❌ 1 Tests Failed:

Tests completed Failed Passed Skipped
1767 1 1766 1
View the top 1 failed tests by shortest run time
helpers/tests/unit/test_log_context.py::::test_as_dict
Stack Traces | 0.042s run time
dbsession = <sqlalchemy.orm.session.Session object at 0x7f12185b2210>
mocker = <pytest_mock.plugin.MockFixture object at 0x7f1218c73890>

    def test_as_dict(dbsession, mocker):
        owner, repo, commit = create_db_records(dbsession)
        log_context = LogContext(commit_id=commit.id_, task_name="foo", task_id="bar")
        log_context.populate_from_sqlalchemy(dbsession)
    
        mock_span = mocker.Mock()
        mock_span.trace_id = 123
        mocker.patch("helpers.log_context.get_current_span", return_value=mock_span)
    
        # `_populated_from_db` is a dataclass field that we want to strip
        # `sentry_trace_id` is a property that we want to include
>       assert log_context.as_dict() == {
            "task_name": "foo",
            "task_id": "bar",
            "commit_id": None,
            "commit_sha": commit.commitid,
            "repo_id": repo.repoid,
            "repo_name": repo.name,
            "owner_id": owner.ownerid,
            "owner_username": owner.username,
            "owner_service": owner.service,
            "owner_plan": owner.plan,
            "sentry_trace_id": 123,
            "checkpoints_data": {},
        }
E       AssertionError: assert {'checkpoints...d': None, ...} == {'checkpoints...id': 191, ...}
E         
E         Omitting 4 identical items, use -vv to show
E         Differing items:
E         {'commit_sha': None} != {'commit_sha': 'c5b67303452bbff57cc1f49984339cde39eb1db5'}
E         {'owner_username': None} != {'owner_username': 'codecove2e'}
E         {'owner_service': None} != {'owner_service': 'github'}
E         {'repo_name': None} != {'repo_name': 'example-python'}...
E         
E         ...Full output truncated (5 lines hidden), use '-vv' to show

.../tests/unit/test_log_context.py:139: AssertionError

To view more test analytics, go to the Test Analytics Dashboard
📢 Thoughts on this report? Let us know!

Copy link

codecov-public-qa bot commented Dec 20, 2024

❌ 1 Tests Failed:

Tests completed Failed Passed Skipped
1767 1 1766 1
View the top 1 failed tests by shortest run time
helpers/tests/unit/test_log_context.py::::test_as_dict
Stack Traces | 0.042s run time
dbsession = <sqlalchemy.orm.session.Session object at 0x7f12185b2210>
mocker = <pytest_mock.plugin.MockFixture object at 0x7f1218c73890>

    def test_as_dict(dbsession, mocker):
        owner, repo, commit = create_db_records(dbsession)
        log_context = LogContext(commit_id=commit.id_, task_name="foo", task_id="bar")
        log_context.populate_from_sqlalchemy(dbsession)
    
        mock_span = mocker.Mock()
        mock_span.trace_id = 123
        mocker.patch("helpers.log_context.get_current_span", return_value=mock_span)
    
        # `_populated_from_db` is a dataclass field that we want to strip
        # `sentry_trace_id` is a property that we want to include
>       assert log_context.as_dict() == {
            "task_name": "foo",
            "task_id": "bar",
            "commit_id": None,
            "commit_sha": commit.commitid,
            "repo_id": repo.repoid,
            "repo_name": repo.name,
            "owner_id": owner.ownerid,
            "owner_username": owner.username,
            "owner_service": owner.service,
            "owner_plan": owner.plan,
            "sentry_trace_id": 123,
            "checkpoints_data": {},
        }
E       AssertionError: assert {'checkpoints...d': None, ...} == {'checkpoints...id': 191, ...}
E         
E         Omitting 4 identical items, use -vv to show
E         Differing items:
E         {'commit_sha': None} != {'commit_sha': 'c5b67303452bbff57cc1f49984339cde39eb1db5'}
E         {'owner_username': None} != {'owner_username': 'codecove2e'}
E         {'owner_service': None} != {'owner_service': 'github'}
E         {'repo_name': None} != {'repo_name': 'example-python'}...
E         
E         ...Full output truncated (5 lines hidden), use '-vv' to show

.../tests/unit/test_log_context.py:139: AssertionError

To view more test analytics, go to the Test Analytics Dashboard
📢 Thoughts on this report? Let us know!

Copy link

✅ All tests successful. No failed tests were found.

📣 Thoughts on this report? Let Codecov know! | Powered by Codecov


# repo_id is enough to get repo and owner
elif self.repo_id:
if self.repo_id:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the intent here is to reenables the log context without the commit?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes

the stated problem is that populating the log context slammed the commit table. turns out if we don't query the commit table the only thing we are likely to lose is the DB PK of the commit which we don't tend to use anyway. so this PR stops querying the commit table

this may just move the problem, and we should maybe do the followup described in the PR description anyway

@matt-codecov
Copy link
Contributor Author

followup task for the model reuse codecov/engineering-team#3140

@matt-codecov matt-codecov added this pull request to the merge queue Dec 20, 2024
Merged via the queue into main with commit 0fd7a81 Dec 20, 2024
22 of 23 checks passed
@matt-codecov matt-codecov deleted the pr977 branch December 20, 2024 17:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants