How to run inference on a .nemo file converted from HuggingFace format? #11478

Open
zhaoyang-star opened this issue Dec 5, 2024 · 1 comment


@zhaoyang-star

1. HF -> NeMo

I converted Qwen2.5-7B to Qwen2.5-7B.nemo (conversion command sketched below). After tar xvf, the contents of Qwen2.5-7B.nemo are as follows:

root@inp16075348439349544319-3-1:/mnt/tenant-home_speed/nemo_models# tar xvf Qwen2.5-7B.nemo 
./
./model_config.yaml
./model_weights/
./model_weights/.metadata
./model_weights/__0_0.distcp
./model_weights/__0_1.distcp
./model_weights/common.pt
./model_weights/metadata.json

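For reference, step 1 used the HF-to-NeMo checkpoint converter that ships with NeMo. A sketch of the invocation (the script path and flags are my assumption based on NeMo's scripts/checkpoint_converters directory and may differ between NeMo versions; the input path is a placeholder):

python3 /opt/NeMo/scripts/checkpoint_converters/convert_qwen2_hf_to_nemo.py \
    --input_name_or_path /path/to/hf/Qwen2.5-7B \
    --output_path Qwen2.5-7B.nemo
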
2. Inference using Qwen2.5-7B.nemo

Then I tried to run inference with Qwen2.5-7B.nemo:

python3 megatron_gpt_eval.py \
            gpt_model_file=Qwen2.5-7B.nemo \
            inference.greedy=True \
            inference.add_BOS=True \
            trainer.devices=1 \
            trainer.num_nodes=1 \
            tensor_model_parallel_size=1 \
            pipeline_model_parallel_size=1 \
            prompts='["who are you?", "What is the captial of China?"]'

The responses are wrong: after each prompt the model just emits a run of digits. Part of the output:

[NeMo I 2024-12-04 21:57:07 nlp_overrides:1386] Model MegatronGPTModel was successfully restored from Qwen2.5-7B.nemo.
prompt=========:['who are you?', 'What is the captial of China?']
setting number of microbatches to constant 1
[NeMo I 2024-12-04 21:57:14 megatron_gpt_model:1717] Pipeline model parallel rank: 0, Tensor model parallel rank: 0, Number of model parameters on device: 7.62e+09. Number of precise model parameters on device: 7615616512.
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0,1,2,3,4,5,6,7]
    
Predicting DataLoader 0:   0%|                                                                                                                                          | 0/1 [00:00<?, ?it/s]setting number of microbatches to constant 1
Predicting DataLoader 0: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:01<00:00,  0.51it/s]
***************************
[{'sentences': ['who are you?1000000000000000000000000000000000', 'What is the captial of China?100000000000000000000000000000'], 'tokens': [['<|im_start|>', 'who', 'Ġare', 'Ġyou', '?', '1', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0'], ['<|im_start|>', 'What', 'Ġis', 'Ġthe', 'Ġcapt', 'ial', 'Ġof', 'ĠChina', '?', '1', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0']], 'logprob': None, 'full_logprob': None, 'token_ids': [[151644, 14623, 525, 498, 30, 16, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15], [151644, 3838, 374, 279, 6427, 530, 315, 5616, 30, 16, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15]], 'offsets': [[0, 0, 3, 7, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45], [0, 0, 4, 7, 11, 16, 19, 22, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58]]}]
***************************
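
To isolate whether the problem is in the conversion or in the base checkpoint, the same prompts can be run through the original HF weights with greedy decoding (a minimal sketch using transformers; the model path is a placeholder, and do_sample=False mirrors inference.greedy=True in the NeMo run):

from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder: the local HF directory (or hub id) that was converted to .nemo
model_path = "Qwen/Qwen2.5-7B"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path, torch_dtype="auto", device_map="auto"
)

# Same prompts as the NeMo run above (typo in the second prompt kept as-is)
for prompt in ["who are you?", "What is the captial of China?"]:
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=32, do_sample=False)  # greedy
    print(tokenizer.decode(out[0], skip_special_tokens=True))

If the HF model answers normally here, the digits-only output points at the conversion or the eval config rather than the weights themselves.
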
@hhd52859

Same problem with Llama 3.1.
