replace bou.ke/monkey with bytedance/mockey #570

samanhappy · 2024-09-05T09:20:33Z

As mentioned in #519, the bou.ke/monkey library has been archived for a long time and fails on the macos-arm64 architecture. Consequently, we need to find an alternative library.

After extensive testing, I found that ~~gomonkey~~ bytedance/mockey is the only library that works, although its APIs are not compatible with bou.ke/monkey. To address this, I have introduced a separate monkey module in EaseProbe that encapsulates the ~~gomonkey~~ bytedance/mockey library. This approach minimizes the cost of modifying test code and allows us to easily replace the library in the future without affecting our test code.

Key code changes include:

Makefile Update: Changed -gcflags=-l to -gcflags=all=-l to accommodate changes for ~~gomonkey~~ bytedance/mockey.
Test Files Update: Most test files only required changing the import path from bou.ke/monkey to github.com/megaease/easeprobe/monkey. However, in probe/ssh/ssh_test.go, where the original usage was unsupported, I modified the logic accordingly.
Monkey Module: In monkey/monkey.go, I used a sync.Map to store all patches, which supports reset operations. A notable method called wait employs busy waiting to prevent test failures on macos-arm64. The exact cause of these errors is unclear to me, and this is a workaround. If anyone has a better solution, please let me know, as I am not very familiar with the ARM64 architecture.

Resolve #519

codecov-commenter · 2024-09-05T09:22:32Z

⚠️ Please install the to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

Attention: Patch coverage is 86.56716% with 9 lines in your changes missing coverage. Please review.

Project coverage is 89.29%. Comparing base (eef3a25) to head (615c2f3).
Report is 65 commits behind head on main.

Files with missing lines	Patch %	Lines
monkey/monkey.go	86.56%	7 Missing and 2 partials ⚠️

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #570      +/-   ##
==========================================
- Coverage   98.80%   89.29%   -9.52%     
==========================================
  Files          85       93       +8     
  Lines        5867     6528     +661     
==========================================
+ Hits         5797     5829      +32     
- Misses         51      670     +619     
- Partials       19       29      +10

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

samanhappy · 2024-09-05T09:44:45Z

Refined: The SIGBUS error still occurs occasionally, as seen here, and I haven't found a solution yet.

proditis

This is a huge change 💪 , and monkey patching is still kind of foreign to me. Testing and letting you know.

suchen-sci · 2024-09-05T14:19:45Z

Refined: The SIGBUS error still occurs occasionally, as seen here, and I haven't found a solution yet.

Try to fix it, then you will be the new monkey king!

samanhappy · 2024-09-08T06:17:17Z

Bad news: It appears that the SIGBUS error is related to the Gomonkey library, as discussed in this issue.

Good news: I've discovered a new monkey patching library bytedance/mockey. After replacing gomonkey with it, the test results are looking promising so far.

proditis

I have run the tests a few times on my system and i seem to be getting some random errors. I havent pinpointed the cause yet just letting you know in case this something you've seen also.

See attached image, these are two consecutive make test runs with no change between invocations.

edit: Sorry i forgot to say that I have made a change to the way the tests run in order to avoid caching, not sure if this is the cause for the error go test -count=1 -gcflags=all=-l -cover -race ${TEST_FLAGS} -v ./...

edit 2: Following up on this, i run the tests 100 times and this error came up 6 times. This is something we'll have to figure out before merging otherwise we will end up with random failures during tests.

samanhappy · 2024-09-08T11:51:53Z

@proditis Thanks so much for your testing efforts! The random errors are also unacceptable to me. To help us investigate further, could you please provide the following information about your test environment:

What is your system version? Is it macOS ARM64?
What are the exact error messages when the tests fail? Do the errors have anything in common, or are they entirely random?

proditis · 2024-09-08T12:13:02Z

My system is Linux x86_64 and go version go1.23.0 linux/amd64.

I will repeat the tests to try to figure out what is causing this failure. Will keep you posted

edit: Please find attached the output from two failed tests test1.log & test2.log

As far as i can see all failed tests are related to the TestOpenLogFail tests.

proditis · 2024-09-08T12:45:43Z

I took a closer look at the failed test. Do we need to monkey.patch this call?

I think that with a bit of rework on this test case we may be able to get rid of this error.

samanhappy · 2024-09-10T02:02:36Z

Sorry for the delayed response. I reviewed the two error logs and the original TestOpenLogFail code. Here are some thoughts:

Both failures occur within the lumberjack library's internal functions. We need to determine whether these errors are caused solely by lumberjack or if our patch operations are contributing to the issue.

While the patch call in the image you attached is necessary, the patch call starting at line 188 (shown below) doesn't seem to have any effect on our test:

var lum *lumberjack.Logger
monkey.PatchInstanceMethod(reflect.TypeOf(lum), "Rotate", func(_ *lumberjack.Logger) error {
    return fmt.Errorf("error")
})

I couldn't reproduce the error on my MacBook. My suggestion is to remove this patch code and run the tests again on your system to determine whether the issue originates from the library or the patch. Does that sound like a good plan?

proditis · 2024-09-10T08:07:42Z

Hi thanks for the feedback, although i still dont understand why these calls need to be patched 🤣

I commented out the following

	//var lum *lumberjack.Logger
	//	monkey.PatchInstanceMethod(reflect.TypeOf(lum), "Rotate", func(_ *lumberjack.Logger) error {
	//		return fmt.Errorf("error")
	//	})

Rerun the tests and got the following error test3.log

PS: This is how I run my tests in case it helps, i am getting a failed run every ~20-40 runs

fail=0; runs=1; while [ "$fail" -eq 0 ];do echo "[$runs] Running tests"; make test >test.log || ((fail=fail+1)); ((runs=runs+1)); done

proditis

LGTM 🎉 🎉 🎉 🎉

samanhappy · 2024-09-10T10:58:32Z

Many thanks to @proditis for helping us resolve the random errors. @suchen-sci, you can now review this PR again.

suchen-sci · 2024-09-13T02:17:09Z

Sorry for the delayed response... LGTM!

samanhappy · 2024-10-10T04:35:35Z

Oops, the SIGBUS: bus error still exists as show here !

proditis · 2024-10-11T07:46:49Z

Oops, the SIGBUS: bus error still exists as show here !

I dont have a mac to run tests, but i suspect the same errors will come up with OpenBSD as well, so i'll prepare a system to run tests and see what fails. OpenBSD is much more strict with these types of violations, so hopefully we'll be able to catch most of these errors.

Could this failure be logger related also?

PS: Maybe we need to open an issue to keep track of these crashes and close it only after we have run enough tests to ensure we fixed all such failures?

samanhappy · 2024-10-12T01:17:30Z

I'm not sure if these errors are related to the logger, but all the SIGBUS: bus error occurrences seem to be specific to macOS-arm64.

I've opened an issue as per your advice. Let's track the progress there and keep each other updated on any developments.

replace monkey with gomonkey

751cc80

samanhappy requested review from proditis and suchen-sci September 5, 2024 09:23

add more tests

9cf153e

proditis reviewed Sep 5, 2024

View reviewed changes

suchen-sci requested a review from proditis September 5, 2024 14:07

samanhappy added 2 commits September 8, 2024 10:41

replace gomonkey with mockey

87c9cc2

restore wait function

8420587

samanhappy changed the title ~~replace bou.ke/monkey with gomonkey~~ replace bou.ke/monkey with bytedance/mockey Sep 8, 2024

proditis reviewed Sep 8, 2024

View reviewed changes

proditis added 2 commits September 10, 2024 13:37

change the TestOpenLogFail case to work without monkey patching

c36b8f5

use the TempDir to avoid messing with Unix/Windows

615c2f3

proditis approved these changes Sep 10, 2024

View reviewed changes

suchen-sci approved these changes Sep 13, 2024

View reviewed changes

suchen-sci merged commit 1ccb321 into megaease:main Sep 13, 2024
6 checks passed

samanhappy mentioned this pull request Oct 12, 2024

Random test failures with SIGBUS: bus error on macos-arm64 #580

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

replace bou.ke/monkey with bytedance/mockey #570

replace bou.ke/monkey with bytedance/mockey #570

samanhappy commented Sep 5, 2024 •

edited

Loading

codecov-commenter commented Sep 5, 2024 •

edited

Loading

samanhappy commented Sep 5, 2024

proditis left a comment

suchen-sci commented Sep 5, 2024

samanhappy commented Sep 8, 2024

proditis left a comment •

edited

Loading

samanhappy commented Sep 8, 2024

proditis commented Sep 8, 2024 •

edited

Loading

proditis commented Sep 8, 2024

samanhappy commented Sep 10, 2024

proditis commented Sep 10, 2024

proditis left a comment

samanhappy commented Sep 10, 2024

suchen-sci commented Sep 13, 2024

samanhappy commented Oct 10, 2024

proditis commented Oct 11, 2024

samanhappy commented Oct 12, 2024

replace bou.ke/monkey with bytedance/mockey #570

replace bou.ke/monkey with bytedance/mockey #570

Conversation

samanhappy commented Sep 5, 2024 • edited Loading

codecov-commenter commented Sep 5, 2024 • edited Loading

Codecov Report

samanhappy commented Sep 5, 2024

proditis left a comment

Choose a reason for hiding this comment

suchen-sci commented Sep 5, 2024

samanhappy commented Sep 8, 2024

proditis left a comment • edited Loading

Choose a reason for hiding this comment

samanhappy commented Sep 8, 2024

proditis commented Sep 8, 2024 • edited Loading

proditis commented Sep 8, 2024

samanhappy commented Sep 10, 2024

proditis commented Sep 10, 2024

proditis left a comment

Choose a reason for hiding this comment

samanhappy commented Sep 10, 2024

suchen-sci commented Sep 13, 2024

samanhappy commented Oct 10, 2024

proditis commented Oct 11, 2024

samanhappy commented Oct 12, 2024

samanhappy commented Sep 5, 2024 •

edited

Loading

codecov-commenter commented Sep 5, 2024 •

edited

Loading

proditis left a comment •

edited

Loading

proditis commented Sep 8, 2024 •

edited

Loading