
[24/12/10] Implementation error on internvideo2_clip.py #218


@junia3 junia3 commented Dec 10, 2024

Hello,

I noticed an issue in the text_encoder loading logic of load_checkpoint: the handling of base_layer-named parameters. The current implementation copies keys such as q_proj or v_proj unchanged, but when LoRA wraps those layers the model expects a .base_layer. segment in the parameter name, so the checkpoint keys fail to match. Here is a revised version of the code that corrects this and ensures proper handling.

Current code in internvideo2_clip.py, load_checkpoint:

# load text_encoder
logger.info(f"Load text_encoder checkpoint from {text_ckpt_path}")
text_ckpt = torch.load(text_ckpt_path, map_location='cpu')
if 'module' in text_ckpt.keys():
    text_ckpt = text_ckpt['module']
for k, v in text_ckpt.items():
    if k.startswith('transformer.') or k == 'text_projection':
        new_k = "text_encoder." + k
    else:
        continue
    new_ckpt[new_k] = v

Suggested code revision:

# load text_encoder
logger.info(f"Load text_encoder checkpoint from {text_ckpt_path}")
text_ckpt = torch.load(text_ckpt_path, map_location='cpu')
if 'module' in text_ckpt.keys():
    text_ckpt = text_ckpt['module']
for k, v in text_ckpt.items():
    if k.startswith('transformer.') or k == 'text_projection':
        new_k = "text_encoder." + k
        if ("q_proj" in k or "v_proj" in k) and "lora" not in k:
            # Correct handling of base_layer
            new_k = new_k.replace(".q_proj", ".q_proj.base_layer").replace(".v_proj", ".v_proj.base_layer")
    else:
        continue
    new_ckpt[new_k] = v

Explanation of the Change:

- The current implementation copies q_proj and v_proj keys verbatim, so they do not match the model's parameter names once peft inserts a .base_layer. segment into those layers.
- The revised code inserts the missing segment with .replace(), while leaving LoRA-specific keys (those containing "lora") untouched.
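
For illustration, here is a minimal, self-contained sketch of the remapping. The checkpoint key names below are hypothetical LLaMA-style examples I made up for the demo, not keys taken from an actual InternVideo2 checkpoint:

# Standalone demo of the key remapping (hypothetical key names).
text_ckpt = {
    "transformer.model.layers.0.self_attn.q_proj.weight": "W_q",
    "transformer.model.layers.0.self_attn.q_proj.lora_A.weight": "A_q",
    "text_projection": "W_p",
}
new_ckpt = {}
for k, v in text_ckpt.items():
    if k.startswith("transformer.") or k == "text_projection":
        new_k = "text_encoder." + k
        if ("q_proj" in k or "v_proj" in k) and "lora" not in k:
            new_k = new_k.replace(".q_proj", ".q_proj.base_layer").replace(".v_proj", ".v_proj.base_layer")
        new_ckpt[new_k] = v

print(list(new_ckpt))
# ['text_encoder.transformer.model.layers.0.self_attn.q_proj.base_layer.weight',
#  'text_encoder.transformer.model.layers.0.self_attn.q_proj.lora_A.weight',
#  'text_encoder.text_projection']

Note that only the non-LoRA q_proj/v_proj keys gain the .base_layer. segment; LoRA adapter keys keep their original names.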

Benefit:

- Fixes errors in handling base_layer-named parameters.
- Ensures robust processing of keys, reducing the likelihood of bugs in models that depend on this checkpoint-loading logic.

Request:

Please review the suggested changes. If this aligns with your expectations, I’d be happy to submit a Pull Request for the fix or assist further in any way required.

Best regards,

If you have any specific questions about this request, please email [email protected]. Thank you!

@yinanhe yinanhe (Member) commented Dec 11, 2024

Hi, thank you for your contribution. I think this is caused by an inconsistent version of peft. When a higher version of peft (such as 0.13.0) is used to initialize LLaMA, .base_layer. is added to the parameter names, while a lower version such as 0.5.0 does not add it. Our checkpoints were trained with a lower version of peft, which is why this problem occurs. If it is convenient, please check your peft version. It may be better to decide whether to replace the key based on the peft version.
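
For reference, here is a minimal sketch of that version gate. This is an illustration under assumptions, not code from the repository: remap_text_key is a hypothetical helper name, and the 0.7.0 threshold is my guess at where peft's base-layer refactor landed, which should be verified against the peft changelog:

# Sketch: gate the key rewrite on the installed peft version.
# Assumes peft and packaging are installed; the 0.7.0 threshold is an
# assumption and should be checked against the peft release notes.
from importlib.metadata import version
from packaging.version import parse

PEFT_ADDS_BASE_LAYER = parse(version("peft")) >= parse("0.7.0")

def remap_text_key(k: str) -> str:
    """Map a text_encoder checkpoint key to the model's parameter name."""
    new_k = "text_encoder." + k
    if PEFT_ADDS_BASE_LAYER and ("q_proj" in k or "v_proj" in k) and "lora" not in k:
        new_k = new_k.replace(".q_proj", ".q_proj.base_layer").replace(".v_proj", ".v_proj.base_layer")
    return new_k

With this, a checkpoint trained under an older peft can still be loaded into a model built with a newer one, and the replacement is skipped when the installed peft already matches the checkpoint's naming.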

@junia3 junia3 (Author) commented Dec 11, 2024

Thank you for your response regarding the PR!
