G_prompt for cut_turbo for dataset with single prompt #662

Open

wants to merge 7 commits into master

Conversation

@wr0124 (Collaborator) commented Jun 20, 2024

Add G_prompt for cut_turbo for unaligned datasets; it works with batch sizes larger than 1.

  • inference
  • unit tests
  • documentation

Training works with the following command line:

python3 train.py \
    --dataroot /data1/juliew/dataset/horse2zebra \
    --checkpoints_dir /data1/juliew/checkpoints \
    --name horse2zebra_turbo \
    --config_json examples/example_cut_turbo_horse2zebra.json \
    --train_batch_size 2 \
    --output_print_freq 10 \
    --data_crop_size 64 \
    --data_load_size 64 \
    --G_prompt zebra

(The --G_prompt option is mandatory if there is no prompt file in the dataset.)

Inference works with the following command line:

cd scripts
python3 gen_single_image.py \
    --model_in_file /data1/juliew/checkpoints/horse2zebra_turbo/latest_net_G_A.pth \
    --img_in /data1/juliew/dataset/horse2zebra/testA/n02381460_1000.jpg \
    --img_out /data1/juliew/target.jpg \
    --prompt zebra \
    --gpuid 0

@beniz beniz changed the title G_prompt for cut_turbo for unaligned horse2zebra dataset G_prompt for cut_turbo for dataset with single prompts Jun 21, 2024
else:
    fake_B = self.netG_A(real_A_with_z)
    self.fake_B = self.netG_A(self.real_with_z)
Contributor:

Remove self

# match batch size
captions_enc = caption_enc.repeat(x.shape[0], 1, 1)
batch_size = caption_enc.shape[0]
Contributor:

Unneeded?

Collaborator (Author):

Inside cut_model, x is created by concatenating real_A and real_B, which doubles the batch size of the x tensor. The prompt tensor has the normal batch size, so to make the two tensors match I made this modification. A detailed toy example is here: https://colab.research.google.com/drive/1RMvHt2PuQufH4zEc2Lrds561L9NEYzYf?usp=sharing
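A toy reproduction of the mismatch and of the repeat fix (shapes follow the thread: 77 tokens, hidden size 1024, a single encoded prompt; all tensors here are random stand-ins, not the actual model code):

    import torch

    batch_size = 2
    real_A = torch.randn(batch_size, 3, 64, 64)
    real_B = torch.randn(batch_size, 3, 64, 64)

    # cut_model concatenates A and B, doubling the batch dimension
    x = torch.cat([real_A, real_B], dim=0)  # (4, 3, 64, 64)

    # a single prompt is encoded once
    caption_enc = torch.randn(1, 77, 1024)

    # repeat the prompt embedding along the batch dimension to match x
    captions_enc = caption_enc.repeat(x.shape[0], 1, 1)
    assert captions_enc.shape == (x.shape[0], 77, 1024)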

Contributor:

This should be fixed by modifying the prompt tensor outside turbo, in cut when A & B are concatenated for inference, not here.
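A sketch of that suggestion, with a hypothetical helper and illustrative names (concat_with_prompt and the tensor shapes are assumptions, not the actual cut_model code):

    import torch

    def concat_with_prompt(real_A, real_B, caption_enc):
        """Hypothetical helper: concatenate A and B for the joint forward
        pass and repeat the prompt tensor to match, so the turbo generator
        itself needs no batch-size fix-up."""
        x = torch.cat([real_A, real_B], dim=0)  # (2 * B, C, H, W)
        caption_enc = caption_enc.repeat(x.shape[0] // caption_enc.shape[0], 1, 1)
        return x, caption_enc

    x, prompt = concat_with_prompt(
        torch.randn(2, 3, 64, 64),
        torch.randn(2, 3, 64, 64),
        torch.randn(1, 77, 1024),  # single encoded prompt
    )
    assert prompt.shape[0] == x.shape[0]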

"D_lr": 0.0001,
"G_ema": false,
"G_ema_beta": 0.999,
"G_lr": 0.0002,
Contributor:

Set G_lr to 0.0001.

@beniz beniz changed the title G_prompt for cut_turbo for dataset with single prompts G_prompt for cut_turbo for dataset with single prompt Jun 26, 2024
@@ -201,17 +201,14 @@ def forward(self, x, prompt):
).input_ids.cuda()
caption_enc = self.text_encoder(caption_tokens)[0]
Contributor:

Are you sure about the [0]?

Collaborator (Author):

With refs:

1. https://huggingface.co/transformers/v4.8.0/model_doc/clip.html#flaxcliptextmodel
2. https://github.com/huggingface/transformers/blob/f91c16d270e5e3ff32fdb32ccf286d05c03dfa66/src/transformers/models/clip/modeling_clip.py#L759

outputs = self.text_encoder(caption_tokens)
type(outputs) = <class 'transformers.modeling_outputs.BaseModelOutputWithPooling'>
len(outputs) = 2
outputs[0].shape = torch.Size([4, 77, 1024])  # last_hidden_state
outputs[1].shape = torch.Size([4, 1024])      # pooler_output

According to the explanation in ref 1, it should be outputs[0].
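A minimal standalone check of that structure (using the public openai/clip-vit-base-patch32 checkpoint as a stand-in; its hidden size is 512 rather than the 1024 shown above, but the output layout is identical):

    import torch
    from transformers import CLIPTokenizer, CLIPTextModel

    tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-base-patch32")
    text_encoder = CLIPTextModel.from_pretrained("openai/clip-vit-base-patch32")

    caption_tokens = tokenizer(
        ["a photo of a zebra"],
        padding="max_length",
        max_length=tokenizer.model_max_length,  # 77 for CLIP
        return_tensors="pt",
    ).input_ids

    with torch.no_grad():
        outputs = text_encoder(caption_tokens)

    # BaseModelOutputWithPooling indexes like a tuple:
    # [0] = last_hidden_state, [1] = pooler_output
    print(outputs[0].shape)  # torch.Size([1, 77, 512])
    print(outputs[1].shape)  # torch.Size([1, 512])
    assert torch.equal(outputs[0], outputs.last_hidden_state)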
