Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Video move task often fails #356

Open
Entrivax opened this issue Jan 14, 2024 · 20 comments
Open

Video move task often fails #356

Entrivax opened this issue Jan 14, 2024 · 20 comments

Comments

@Entrivax
Copy link
Contributor

Quite often, the video move task ends up failing, even though the video was correctly moved.

image

2024-01-14 13:14:18 {"level":"debug","time":"2024-01-14T13:14:18+01:00","message":"finished vod video convert for 1988851558"}
2024-01-14 13:14:19 {"level":"debug","time":"2024-01-14T13:14:19+01:00","message":"checking if tasks are done for video 1988851558"}
2024-01-14 13:14:21 {"level":"debug","Attempt":1,"ChildWorkflowID":"1c27280f-07ed-479e-9d09-f584eb680cac_23","Namespace":"default","RunID":"1c27280f-07ed-479e-9d09-f584eb680cac","TaskQueue":"archive","WorkerID":"cc4cea6ec55c","WorkflowID":"bb9c0b61-f26b-4dc1-9269-8e82471db9e6_44","WorkflowType":"MoveVideoWorkflow","time":"2024-01-14T13:14:21+01:00","message":"ExecuteChildWorkflow"}
2024-01-14 13:14:24 {"level":"debug","ActivityID":"5","ActivityType":"MoveVideo","Attempt":1,"Namespace":"default","RunID":"7aef4ed1-3e4f-4e36-8dfe-1f6e35dc6434","TaskQueue":"archive","WorkerID":"cc4cea6ec55c","WorkflowID":"1c27280f-07ed-479e-9d09-f584eb680cac_23","WorkflowType":"MoveVideoWorkflow","time":"2024-01-14T13:14:24+01:00","message":"ExecuteActivity"}
2024-01-14 13:14:25 {"level":"debug","time":"2024-01-14T13:14:25+01:00","message":"moving file: /tmp/1988851558_043754bf-b273-11ee-a3a1-0242ac150002-video-convert.mp4 to /vods/hortyunderscore/2023-11-27-1988851558-archive-043754bf-b273-11ee-a3a1-0242ac150002/1988851558-video.mp4"}
2024-01-14 13:15:28 {"level":"debug","time":"2024-01-14T13:15:28+01:00","message":"running check live stream schedule"}
2024-01-14 13:15:28 {"level":"debug","time":"2024-01-14T13:15:28+01:00","message":"checking live channels"}
2024-01-14 13:15:28 {"level":"debug","time":"2024-01-14T13:15:28+01:00","message":"no live watched channels"}
2024-01-14 13:20:28 {"level":"debug","time":"2024-01-14T13:20:28+01:00","message":"running check live stream schedule"}
2024-01-14 13:20:28 {"level":"debug","time":"2024-01-14T13:20:28+01:00","message":"checking live channels"}
2024-01-14 13:20:28 {"level":"debug","time":"2024-01-14T13:20:28+01:00","message":"no live watched channels"}
2024-01-14 13:23:13 {"level":"warn","ActivityID":"5","ActivityType":"DownloadTwitchVideo","Attempt":1,"Error":"context deadline exceeded","Namespace":"default","RunID":"458fabf3-90c4-4244-a820-b2fd81554cba","TaskQueue":"video-download","WorkerID":"cc4cea6ec55c","WorkflowID":"82d2649a-9c03-49b2-925e-97191cc486c9_5","WorkflowType":"DownloadTwitchVideoWorkflow","time":"2024-01-14T13:23:13+01:00","message":"RecordActivityHeartbeat with error"}
2024-01-14 13:23:44 {"level":"warn","ActivityID":"5","ActivityType":"MoveVideo","Attempt":1,"Error":"context deadline exceeded","Namespace":"default","RunID":"7aef4ed1-3e4f-4e36-8dfe-1f6e35dc6434","TaskQueue":"archive","WorkerID":"cc4cea6ec55c","WorkflowID":"1c27280f-07ed-479e-9d09-f584eb680cac_23","WorkflowType":"MoveVideoWorkflow","time":"2024-01-14T13:23:44+01:00","message":"RecordActivityHeartbeat with error"}
2024-01-14 13:24:13 {"level":"warn","ActivityID":"5","ActivityType":"DownloadTwitchVideo","Attempt":1,"Error":"context canceled","Namespace":"default","RunID":"458fabf3-90c4-4244-a820-b2fd81554cba","TaskQueue":"video-download","WorkerID":"cc4cea6ec55c","WorkflowID":"82d2649a-9c03-49b2-925e-97191cc486c9_5","WorkflowType":"DownloadTwitchVideoWorkflow","time":"2024-01-14T13:24:13+01:00","message":"RecordActivityHeartbeat with error"}
2024-01-14 13:24:44 {"level":"warn","ActivityID":"5","ActivityType":"MoveVideo","Attempt":1,"Error":"context canceled","Namespace":"default","RunID":"7aef4ed1-3e4f-4e36-8dfe-1f6e35dc6434","TaskQueue":"archive","WorkerID":"cc4cea6ec55c","WorkflowID":"1c27280f-07ed-479e-9d09-f584eb680cac_23","WorkflowType":"MoveVideoWorkflow","time":"2024-01-14T13:24:44+01:00","message":"RecordActivityHeartbeat with error"}
2024-01-14 13:24:51 {"level":"debug","time":"2024-01-14T13:24:51+01:00","message":"deleting file: /tmp/1988851558_043754bf-b273-11ee-a3a1-0242ac150002-video.mp4"}
2024-01-14 13:24:54 {"level":"error","ActivityType":"MoveVideo","Attempt":1,"Error":"context canceled","Namespace":"default","RunID":"7aef4ed1-3e4f-4e36-8dfe-1f6e35dc6434","TaskQueue":"archive","WorkerID":"cc4cea6ec55c","WorkflowID":"1c27280f-07ed-479e-9d09-f584eb680cac_23","time":"2024-01-14T13:24:54+01:00","message":"Activity error."}
2024-01-14 13:25:04 {"level":"info","Error":"context deadline exceeded","Namespace":"default","TaskQueue":"archive","WorkerID":"cc4cea6ec55c","WorkerType":"ActivityWorker","time":"2024-01-14T13:25:04+01:00","message":"Task processing failed with error"}
2024-01-14 13:25:13 {"level":"warn","ActivityID":"5","ActivityType":"DownloadTwitchVideo","Attempt":1,"Error":"context canceled","Namespace":"default","RunID":"458fabf3-90c4-4244-a820-b2fd81554cba","TaskQueue":"video-download","WorkerID":"cc4cea6ec55c","WorkflowID":"82d2649a-9c03-49b2-925e-97191cc486c9_5","WorkflowType":"DownloadTwitchVideoWorkflow","time":"2024-01-14T13:25:13+01:00","message":"RecordActivityHeartbeat with error"}

[...]

2024-01-14 13:36:49 {"level":"debug","time":"2024-01-14T13:36:49+01:00","message":"moving file: /tmp/1988851558_043754bf-b273-11ee-a3a1-0242ac150002-video-convert.mp4 to /vods/hortyunderscore/2023-11-27-1988851558-archive-043754bf-b273-11ee-a3a1-0242ac150002/1988851558-video.mp4"}
2024-01-14 13:36:49 {"level":"error","ActivityType":"MoveVideo","Attempt":3,"Error":"error opening file: open /tmp/1988851558_043754bf-b273-11ee-a3a1-0242ac150002-video-convert.mp4: no such file or directory","Namespace":"default","RunID":"7aef4ed1-3e4f-4e36-8dfe-1f6e35dc6434","TaskQueue":"archive","WorkerID":"cc4cea6ec55c","WorkflowID":"1c27280f-07ed-479e-9d09-f584eb680cac_23","time":"2024-01-14T13:36:49+01:00","message":"Activity error."}
2024-01-14 13:36:49 {"level":"warn","ActivityID":"5","ActivityType":"MoveVideo","Attempt":3,"Error":"context canceled","Namespace":"default","RunID":"7aef4ed1-3e4f-4e36-8dfe-1f6e35dc6434","TaskQueue":"archive","WorkerID":"cc4cea6ec55c","WorkflowID":"1c27280f-07ed-479e-9d09-f584eb680cac_23","WorkflowType":"MoveVideoWorkflow","time":"2024-01-14T13:36:49+01:00","message":"RecordActivityHeartbeat with error"}

After this, the video is not present in the /tmp dir anymore, but was correctly moved in the destination directory, as you can see:
image

From the logs above, it looks like the move task is taking too long, going to timeout, but still executing, then when wanting to retry, the source file is not here anymore.

@Zibbp
Copy link
Owner

Zibbp commented Jan 14, 2024

I believe this is a similar issue to #354. The heartbeats inside of the long running functions aren't properly ending when the task finished causing weird issues.

@Entrivax
Copy link
Contributor Author

Yep, looks like it

@Zibbp
Copy link
Owner

Zibbp commented Jan 16, 2024

I have some changes to the Heartbeat available using the :main image tag if you'd like to test it out. Not sure if it will fix this issue but I saw the heartbeat error in the logs.

@KyuubiDDragon
Copy link

i also tried to use the main, because i have the same issue.
ghcr.io/zibbp/ganymede:main etc but got an error after the pulling because manifest unknown

@Zibbp
Copy link
Owner

Zibbp commented Feb 16, 2024

i also tried to use the main, because i have the same issue. ghcr.io/zibbp/ganymede:main etc but got an error after the pulling because manifest unknown

I'm assuming you're running on an ARM cpu? If so, you'll need to specify the arm-specific tag if you want to run off of main, main-arm64. The latest tag combines both x86 and arm so that's why it works for that.

@KyuubiDDragon
Copy link

main-arm64 same problem. only latest works fine for me. i tried with main and main-arm64

@Zibbp
Copy link
Owner

Zibbp commented Feb 16, 2024

What CPU is this running on? And to confirm, you're only changing the API container image? That's the only one required for this change.

@KyuubiDDragon
Copy link

Thats the issue. i changed all the 3 containers :). i will now test if the download will work correctly :)

@KyuubiDDragon
Copy link

A short feedback:
With the api changes, theres the workflow tab that doesnt work.
All the other tabs works fine.

f8 error
TypeError: w is undefined NextJS 69 [_app-460189a8d20acafc.js:9:2661](http://192.168.50.51:4801/_next/static/chunks/pages/_app-460189a8d20acafc.js) TypeError: w is undefined NextJS 69 [_app-460189a8d20acafc.js:9:2661](http://192.168.50.51:4801/_next/static/chunks/pages/_app-460189a8d20acafc.js) A client-side exception has occurred, see here for more info: https://nextjs.org/docs/messages/client-side-exception-occurred [_app-460189a8d20acafc.js:9:2661](http://192.168.50.51:4801/_next/static/chunks/pages/_app-460189a8d20acafc.js)

@Zibbp
Copy link
Owner

Zibbp commented Feb 16, 2024

Is that error from the browser console? If you go to the network tab of the browser tools, any failed requests? Is this just the /workflows page?

@KyuubiDDragon
Copy link

That error is from the browser console (f8)
after the click there are 2 files: active?next_page_token= and closed?next_page_token= but without something after the =

The answer from the get is "null"
grafik

@Zibbp
Copy link
Owner

Zibbp commented Feb 19, 2024

v2.1.0 has been released with some fixes for the original issue in this post. Can you try it out and let me know if the video move error still happens?

That error is from the browser console (f8) after the click there are 2 files: active?next_page_token= and closed?next_page_token= but without something after the =

The answer from the get is "null" grafik

Can you try the newly released API and frontend version and let me know if this still happens? v2.1.0

@KyuubiDDragon
Copy link

i tried it again, and i have these logs:

2024/02/19 12:25:22 stdout {"level":"error","ActivityType":"MoveVideo","Attempt":3,"Error":"error opening file: open /tmp/2064978132_f68c286d-cf0c-11ee-ba64-0242c0a81005-video-convert.mp4: no such file or directory","Namespace":"default","RunID":"41e056a7-f643-48e0-b4b6-d34f84e71cc9","TaskQueue":"archive","WorkerID":"0e599836dbb4","WorkflowID":"975ed816-f3e9-4060-ac3f-b0f4c5f4d4f3_23","time":"2024-02-19T11:25:22Z","message":"Activity error."} 2024/02/19 12:25:22 stderr {"level":"debug","time":"2024-02-19T11:25:22Z","message":"moving file: /tmp/2064978132_f68c286d-cf0c-11ee-ba64-0242c0a81005-video-convert.mp4 to /vods/kyuubiddragon/2024-02-17-2064978132-archive-f68c286d-cf0c-11ee-ba64-0242c0a81005/2064978132-video.mp4"} 2024/02/19 12:23:22 stdout {"level":"error","ActivityType":"MoveVideo","Attempt":2,"Error":"error opening file: open /tmp/2064978132_f68c286d-cf0c-11ee-ba64-0242c0a81005-video-convert.mp4: no such file or directory","Namespace":"default","RunID":"41e056a7-f643-48e0-b4b6-d34f84e71cc9","TaskQueue":"archive","WorkerID":"0e599836dbb4","WorkflowID":"975ed816-f3e9-4060-ac3f-b0f4c5f4d4f3_23","time":"2024-02-19T11:23:22Z","message":"Activity error."} 2024/02/19 12:23:22 stderr {"level":"debug","time":"2024-02-19T11:23:22Z","message":"moving file: /tmp/2064978132_f68c286d-cf0c-11ee-ba64-0242c0a81005-video-convert.mp4 to /vods/kyuubiddragon/2024-02-17-2064978132-archive-f68c286d-cf0c-11ee-ba64-0242c0a81005/2064978132-video.mp4"} 2024/02/19 12:23:02 stderr {"level":"debug","time":"2024-02-19T11:23:02Z","message":"deleting file: /tmp/2064978132_f68c286d-cf0c-11ee-ba64-0242c0a81005-video.mp4"} 2024/02/19 12:15:32 stderr {"level":"debug","time":"2024-02-19T11:15:32Z","message":"moving file: /tmp/2064978132_f68c286d-cf0c-11ee-ba64-0242c0a81005-video-convert.mp4 to /vods/kyuubiddragon/2024-02-17-2064978132-archive-f68c286d-cf0c-11ee-ba64-0242c0a81005/2064978132-video.mp4"} 2024/02/19 12:05:25 stderr {"level":"debug","time":"2024-02-19T11:05:25Z","message":"video convert args: [-y -hide_banner -i /tmp/2064978132_f68c286d-cf0c-11ee-ba64-0242c0a81005-video.mp4 -c:v copy -c:a copy /tmp/2064978132_f68c286d-cf0c-11ee-ba64-0242c0a81005-video-convert.mp4]"} 2024/02/19 11:48:56 stderr {"level":"debug","time":"2024-02-19T10:48:56Z","message":"video convert args: [-y -hide_banner -i /tmp/2064978132_f68c286d-cf0c-11ee-ba64-0242c0a81005-video.mp4 -c:v copy -c:a copy /tmp/2064978132_f68c286d-cf0c-11ee-ba64-0242c0a81005-video-convert.mp4]"} 2024/02/19 10:56:01 stderr {"level":"debug","time":"2024-02-19T09:56:01Z","message":"moving file: /tmp/2064978132_f68c286d-cf0c-11ee-ba64-0242c0a81005-chat.json to /vods/kyuubiddragon/2024-02-17-2064978132-archive-f68c286d-cf0c-11ee-ba64-0242c0a81005/2064978132-chat.json"}

In that case you can see, that he moved the file sucessfully and then try it multiple times again, but the file is not available and then i fails

@KyuubiDDragon
Copy link

@Zibbp Do you have a fix for this issue?

@Zibbp
Copy link
Owner

Zibbp commented Feb 22, 2024

Is this a live stream archive? If possible, can you post the full output from when the archive started to when errors start?

@KyuubiDDragon
Copy link

No, its not a live stream archive. its a normal archive from stores vods

@FirestrikeX
Copy link

Having an issue just like this as well. Out of 10 VODs, it moved all of them successfully but on 8 out of 10 of them it showed Video Move as errored so it wont every fully complete.

@Zibbp
Copy link
Owner

Zibbp commented Jun 7, 2024

Having an issue just like this as well. Out of 10 VODs, it moved all of them successfully but on 8 out of 10 of them it showed Video Move as errored so it wont every fully complete.

Can you provide logs from the API container around the time the error happens? To confirm, you're running the latest version?

@FirestrikeX
Copy link

Having an issue just like this as well. Out of 10 VODs, it moved all of them successfully but on 8 out of 10 of them it showed Video Move as errored so it wont every fully complete.

Can you provide logs from the API container around the time the error happens? To confirm, you're running the latest version?

Of course! Thanks! I deleted 3 VODs that had the error and restarted all containers. Below is the log file from the api container from when it re-aquires those 3 VODs from the channel up to the point where the status is reported as error on all 3 vods in the Queue.
https://gist.github.com/FirestrikeX/f16d20fb34e3589eadb0297b463d16bd

@Zibbp
Copy link
Owner

Zibbp commented Jun 10, 2024

The formatting of the logs you uploaded got all messed up and it's difficult to read. Can you run docker logs ganymede-api >& /tmp/ganymede-api.txt and upload the resulting file here? You may need to zip the .txt file.

I was able to archive 2159565823 on my test instance fine. It's only these 3 VODs that are having issues?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants