devops: CI failure in tee when Rust tests pass #7564

Closed
oxarbitrage opened this issue Sep 14, 2023 · 12 comments · Fixed by #7580 or #7690
Labels
A-devops Area: Pipelines, CI/CD and Dockerfiles C-bug Category: This is a bug

Comments

@oxarbitrage
Contributor

Describe the issue or request

This seems to be a new issue in CI. I am not sure how often it happens; I have only seen it once so far, but it might be worth having a ticket in case it happens again.

A pull request failed with

tee: 'standard output': Broken pipe
test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 44 filtered out; finished in 2.11s
sudo docker exit status: 101
Error: Process completed with exit code 101.

at https://github.com/ZcashFoundation/zebra/actions/runs/6165092427/job/16733752210?pr=7515

According to @teor2345 in Slack: "The Rust test passes, but Docker returns an error status. This is probably an issue with the self-hosted runner changes."

Expected Behavior

No failure, or more details about what failed inside Docker.

Current Behavior

Failure with sudo docker exit status: 101

Possible Solution

We need to figure out whether this was a temporary issue. We might not have to do anything to resolve the ticket other than add a short comment explaining what went wrong and what to do if it happens again.

Additional Information/Context

It looks like the failure is unrelated to what the pull request was doing.

Is this happening on PRs?

yes

Is this happening on the main branch?

not seen yet

@oxarbitrage oxarbitrage added C-bug Category: This is a bug A-devops Area: Pipelines, CI/CD and Dockerfiles S-needs-triage Status: A bug report needs triage labels Sep 14, 2023
@mpguerra mpguerra added this to Zebra Sep 14, 2023
@github-project-automation github-project-automation bot moved this to 🆕 New in Zebra Sep 14, 2023
@mpguerra
Contributor

@gustavovalverde any ideas?

@teor2345 teor2345 changed the title devops: Failure maybe related to self-hosted-runners changes devops: CI failure when Rust tests pass, maybe related to self-hosted-runners changes Sep 17, 2023
@teor2345
Contributor

This is happening multiple times a week.

@gustavovalverde
Member

I'll have a look at this today.

@gustavovalverde
Member

A bit of context on this error:

The error tee: 'standard output': Broken pipe typically occurs when one process in a pipeline terminates before the other process has finished reading its input. In this case, the error is likely caused by the grep command finding a match and exiting (due to --max-count=1), while tee is still trying to write to the pipe.

Re-running the same workflow works as expected, so a timing-dependent cause like this might be the reason.
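
The mechanism described above can be reproduced with a minimal sketch. This is not the project's actual CI command: `yes` stands in for a long-running log producer, and the matched pattern is an assumption taken from the log excerpt in the issue description. Each pipeline runs under `bash -c` because `pipefail` is assumed to be a bash option here.

```shell
# Hypothetical stand-in pipeline: `yes` keeps writing, grep exits after one match,
# so tee is left writing to a pipe that no longer has a reader.
pipeline='yes "test result: ok" | tee /dev/null | grep --max-count=1 "test result: ok" > /dev/null'

# Without pipefail, the pipeline's status is the status of its last command (grep).
status_without=$(bash -c "$pipeline; echo \$?" 2> /dev/null)

# With pipefail, the rightmost command that exits non-zero (tee here) decides the status.
status_with=$(bash -c "set -o pipefail; $pipeline; echo \$?" 2> /dev/null)

echo "without pipefail: $status_without"
echo "with pipefail:    $status_with"
```

With default signal handling, bash usually reports 141 (128 + 13, the SIGPIPE signal number) in the second case. If SIGPIPE is ignored in the environment, tee fails with a write error instead, which is where a message like the `Broken pipe` line quoted above would come from.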

@teor2345
Contributor

It's possible that this error happened in some PRs because we were running multiple test commands, but that should be fixed now.

Does CI always fail when there is a tee error, or can that error happen without CI failing?
If CI always fails, removing the set -e might help here.

@gustavovalverde
Member

that error happen without CI failing?

Yes, the error also happens in runs where CI does not fail.

@mergify mergify bot closed this as completed in #7580 Sep 20, 2023
@github-project-automation github-project-automation bot moved this from 🆕 New to ✅ Done in Zebra Sep 20, 2023
@teor2345 teor2345 changed the title devops: CI failure when Rust tests pass, maybe related to self-hosted-runners changes devops: CI failure in tee when Rust tests pass Oct 4, 2023
@teor2345
Contributor

teor2345 commented Oct 4, 2023

This failed in PR #7663 even though it included the fix in PR #7580 https://github.com/ZcashFoundation/zebra/actions/runs/6405648679/job/17392070905#step:9:5457

I suggest we do what we did previously to fix this, and remove the pipefail.
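
One way to read "remove the pipefail" is to relax it only around the log-watching pipeline. The following is a sketch under that assumption and is not taken from PR #7580 or any other change in the repository: the strict-mode flags, the `yes` producer, and the matched pattern are all hypothetical.

```shell
# Hypothetical script body, run under bash; only the pipefail toggling matters here.
result=$(bash -c '
set -eo pipefail    # strict mode, assumed for the surrounding CI script
set +o pipefail     # relax it only around the pipeline whose reader exits early
yes "test result: ok" | tee /dev/null | grep --max-count=1 "test result: ok" > /dev/null
set -o pipefail     # restore strict pipeline handling for later steps
echo "continued after pipeline"
' 2> /dev/null)

echo "$result"
```

Because the pipeline's status is then just grep's status, `set -e` does not abort the script when tee is cut off. The trade-off is that a genuine failure in an earlier stage of that one pipeline would also go unnoticed.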

@teor2345
Contributor

teor2345 commented Oct 4, 2023

@mpguerra this bug might need to be re-scheduled, because it's causing occasional CI failures in unrelated PRs.

@mpguerra
Contributor

mpguerra commented Oct 5, 2023

@mpguerra this bug might need to be re-scheduled, because it's causing occasional CI failures in unrelated PRs.

I've scheduled this for the next sprint for now, since it's not happening all the time.

@teor2345
Contributor

teor2345 commented Oct 11, 2023

This is still happening, but I've only seen it in the CD tests, see the tagged PRs for details.

@teor2345 teor2345 reopened this Oct 11, 2023
@gustavovalverde
Member

gustavovalverde commented Oct 16, 2023

I'm almost certain this was improved (or maybe fixed) by fbd6b0f#diff-4f5cabe26761257a4d685a6edc7a43e0fe0f78762f50eeb48530f2bd3b3ee7caR186

That might also be why the full sync did not crash with this error this week.

@teor2345
Contributor

I haven't seen this recently, are you happy to close it?

@mpguerra mpguerra removed the S-needs-triage Status: A bug report needs triage label Oct 26, 2023
Projects
Archived in project
4 participants