Problem with finetuning speed #9

Open
insundaycathy opened this issue Aug 15, 2023 · 1 comment

Comments

@insundaycathy

Hi, thanks for the great work.
But when I tried to finetune the network on my own data, I ran into efficiency problems.

  1. If I set num_workers in the DataLoader to >0, data loading becomes extremely slow, and the loading time increases with each additional worker.
  2. The time to backpropagate through the graph (the time to execute this line of code) increases in proportion to the batch size.
    scaler.scale(loss).backward()

I want to ask whether this is normal during finetuning, or whether I have somehow introduced a bug. Also, is there any way to speed it up?

@klauscc
Owner

klauscc commented Aug 16, 2023

Hi, thanks for your interest.
I encountered a similar issue in another project on other servers.
The issue may be caused by decord, the video reader we use: decord seems to have some issues with PyTorch's multiprocessing in the DataLoader.

My solution is to pass multiprocessing_context="spawn" when creating the DataLoader:

dataloader = DataLoader(multiprocessing_context="spawn",
                        ....)
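For context, here is a minimal sketch of how this looks in a full setup. The VideoDataset class, file paths, and frame-sampling logic below are hypothetical placeholders for illustration, not code from this repo; only the multiprocessing_context="spawn" argument is the actual fix being suggested.

import decord
import torch
from torch.utils.data import DataLoader, Dataset

class VideoDataset(Dataset):
    """Hypothetical decord-backed dataset, for illustration only."""

    def __init__(self, video_paths, num_frames=8):
        self.video_paths = video_paths
        self.num_frames = num_frames

    def __len__(self):
        return len(self.video_paths)

    def __getitem__(self, idx):
        # Each worker process decodes its own videos with decord.
        vr = decord.VideoReader(self.video_paths[idx])
        indices = torch.linspace(0, len(vr) - 1, self.num_frames).long().tolist()
        frames = vr.get_batch(indices)  # (num_frames, H, W, C), uint8
        return torch.from_numpy(frames.asnumpy())

dataloader = DataLoader(
    VideoDataset(["clip0.mp4", "clip1.mp4"]),  # placeholder paths
    batch_size=2,
    num_workers=4,
    multiprocessing_context="spawn",  # spawn workers instead of forking them,
                                      # which avoids the decord slowdown
)

(Alternatively, the start method can be set process-wide with torch.multiprocessing.set_start_method("spawn"); passing multiprocessing_context only affects that particular DataLoader.)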
