Edgemesh seems to have performance issues; after a period of time, UDP packets can no longer be sent out. #567

Open
zhuyaguang opened this issue Jun 17, 2024 · 3 comments · May be fixed by #569
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments


Edgemesh seems to have performance issues; after a period of time, UDP packets can no longer be sent out.

After running for a while, UDP packets can no longer be sent out. Restarting EdgeMesh on the node fixes the issue until it recurs.

zhuyaguang added the kind/bug label on Jun 17, 2024

wpx1990 commented Jun 20, 2024

Edgemesh version: v1.17.0

PodA, on NodeA, wants to send a UDP packet to PodB, on NodeB. Leaving DNS and Service resolution aside, the packet travels this path:

PodA---(udp)--->EdgemeshA---(p2p stream)--->EdgemeshB---(udp)--->PodB

After a few hundred UDP packets have been sent along this path, the path stops working.

The bug may be in EdgemeshB.

I added logs in EdgemeshB, and an error is logged whenever the path stops working.

image

The error log is:

-- accept a new p2p stream err(stream-565: peer:12D3KooWC2TDds3pYR4HuNtrSdqezJx9wtvRoVHpsX59Gkp8YuUW: cannot reserve inbound stream: resource limit exceeded).

I then debugged EdgemeshB and found that the error occurs here:

image

I added a log at line 174. When the path stops working, the log is:

-------- Stream Limit, limit(488), rc.nstreamsIn+incount(489)

For details on the go-libp2p stream limit, see go-libp2p-resource-manager.

I added logs to the addStreams and removeStreams functions, as shown below:

image

I observed that when a UDP packet travels this path successfully, there is an addStreams log but no removeStreams log.
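The effect of reserving streams without ever releasing them can be shown with a toy model. This is only a sketch and not go-libp2p's code: the `scope` type, its fields, the `errLimit` value, and the `firstFailure` helper are hypothetical. Only the names addStreams and removeStreams and the limit of 488 come from the logs in this comment.

```go
package main

import (
	"errors"
	"fmt"
)

// errLimit stands in for the "resource limit exceeded" error seen in the log.
// The real error type in go-libp2p is different; this is an assumption.
var errLimit = errors.New("cannot reserve inbound stream: resource limit exceeded")

// scope is a hypothetical, simplified model of a per-peer resource scope.
type scope struct {
	limit      int // maximum number of inbound streams
	nstreamsIn int // inbound streams currently reserved
}

// addStreams reserves n inbound streams or fails if the limit would be exceeded.
func (s *scope) addStreams(n int) error {
	if s.nstreamsIn+n > s.limit {
		return errLimit
	}
	s.nstreamsIn += n
	return nil
}

// removeStreams releases n inbound streams, never dropping below zero.
func (s *scope) removeStreams(n int) {
	s.nstreamsIn -= n
	if s.nstreamsIn < 0 {
		s.nstreamsIn = 0
	}
}

// firstFailure opens up to total streams one after another and returns the
// 1-based index of the first stream that cannot be reserved, or 0 if all
// reservations succeed. When release is false the handler "leaks" the stream.
func firstFailure(limit, total int, release bool) int {
	s := &scope{limit: limit}
	for i := 1; i <= total; i++ {
		if err := s.addStreams(1); err != nil {
			return i
		}
		if release {
			s.removeStreams(1)
		}
	}
	return 0
}

func main() {
	fmt.Println("leaking handler, first failed stream:", firstFailure(488, 1000, false))
	fmt.Println("closing handler, first failed stream:", firstFailure(488, 1000, true))
}
```

In this model the leaking case fails on stream 489, which matches the `limit(488), rc.nstreamsIn+incount(489)` log, while the releasing case never hits the limit.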

So in EdgemeshB, where should inStream.Close() be executed?

In the go-libp2p example below, inStream.Close() is executed in the streamHandler function.

image

So in EdgeMesh, can inStream.Close() be executed in ProxyConnUDP, as below?

image

As soon as I added "defer inConn.Close()" at conn.go:58, the bug seemed to be solved.

If this is the right way to fix the bug, please also check whether the same bug exists in the TCP packet processing function and in the StreamHandler functions of the other p2p protocols.


ravaga commented Jul 5, 2024

I'm experiencing a similar issue with UDP packets sent by the pods of a DaemonSet to CoreDNS to resolve the IP of a K8s Service. These pods send one request per second, and after some time the connection fails. Restarting the EdgeMesh pods solves it, but after running for a while it fails again.


wpx1990 commented Aug 2, 2024

image

Additional notes:

When UDP packets are sent at long intervals, the UDP connection can hit its deadline (udpconn.deadline) and this for loop breaks. After that the stream can no longer be used, but it is not released.

This can trigger the issue quickly.

3 participants