
Mistake in the preparation of vicuna weights (error when loading delta weights) #52

huangzhongzhong opened this issue Apr 19, 2023 · 7 comments


@huangzhongzhong

I run the script to get the vicuna weights and get the following error:

python -m fastchat.model.apply_delta --base I:\chatgpt\minigpt4\MiniGPT-4\llama-13b-hf --target I:\chatgpt\minigpt4\MiniGPT-4\model --delta I:\chatgpt\minigpt4\MiniGPT-4\vicuna-13b-delta-v0

[screenshot: error output from apply_delta]

@huangzhongzhong
Author

'model.layers.18.mlp.gate_proj.weight', 'model.layers.13.mlp.down_proj.weight', 'model.layers.18.self_attn.q_proj.weight', 'model.layers.39.self_attn.o_proj.weight', 'model.layers.17.mlp.up_proj.weight', 'model.layers.24.self_attn.q_proj.weight', 'model.layers.2.post_attention_layernorm.weight', 'model.layers.17.mlp.down_proj.weight', 'model.layers.27.mlp.down_proj.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ C:\Users\huang\.conda\envs\minigpt4\lib\runpy.py:197 in _run_module_as_main │
│ │
│ 194 │ main_globals = sys.modules["__main__"].__dict__ │
│ 195 │ if alter_argv: │
│ 196 │ │ sys.argv[0] = mod_spec.origin │
│ ❱ 197 │ return _run_code(code, main_globals, None, │
│ 198 │ │ │ │ │ "__main__", mod_spec) │
│ 199 │
│ 200 def run_module(mod_name, init_globals=None, │
│ │
│ C:\Users\huang\.conda\envs\minigpt4\lib\runpy.py:87 in _run_code │
│ │
│ 84 │ │ │ │ │ loader = loader, │
│ 85 │ │ │ │ │ package = pkg_name, │
│ 86 │ │ │ │ │ spec = mod_spec) │
│ ❱ 87 │ exec(code, run_globals) │
│ 88 │ return run_globals │
│ 89 │
│ 90 def _run_module_code(code, init_globals=None, │
│ │
│ C:\Users\huang\.conda\envs\minigpt4\lib\site-packages\fastchat\model\apply_delta.py:153 in <module> │
│ │
│ │
│ 150 │ if args.low_cpu_mem: │
│ 151 │ │ apply_delta_low_cpu_mem(args.base_model_path, args.target_model_path, args.delta │
│ 152 │ else: │
│ ❱ 153 │ │ apply_delta(args.base_model_path, args.target_model_path, args.delta_path) │
│ 154 │
│ │
│ C:\Users\huang\.conda\envs\minigpt4\lib\site-packages\fastchat\model\apply_delta.py:124 in │
│ apply_delta │
│ │
│ 121 │ print(f"Loading the base model from {base_model_path}") │
│ 122 │ base = AutoModelForCausalLM.from_pretrained( │
│ 123 │ │ base_model_path, torch_dtype=torch.float16, low_cpu_mem_usage=True) │
│ ❱ 124 │ base_tokenizer = AutoTokenizer.from_pretrained( │
│ 125 │ │ base_model_path, use_fast=False) │
│ 126 │ │
│ 127 │ print(f"Loading the delta from {delta_path}") │
│ │
│ C:\Users\huang\.conda\envs\minigpt4\lib\site-packages\transformers\models\auto\tokenization_auto │
│ .py:689 in from_pretrained │
│ │
│ 686 │ │ │ │ tokenizer_class = tokenizer_class_from_name(tokenizer_class_candidate) │
│ 687 │ │ │ │
│ 688 │ │ │ if tokenizer_class is None: │
│ ❱ 689 │ │ │ │ raise ValueError( │
│ 690 │ │ │ │ │ f"Tokenizer class {tokenizer_class_candidate} does not exist or is n │
│ 691 │ │ │ │ ) │
│ 692 │ │ │ return tokenizer_class.from_pretrained(pretrained_model_name_or_path, *input │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
ValueError: Tokenizer class LLaMATokenizer does not exist or is not currently imported.
(minigpt4) PS I:\chatgpt\minigpt4\MiniGPT-4>

@gch8295322

Change "tokenizer_class": "LLaMATokenizer" in llama-13b-hf/tokenizer_config.json into "tokenizer_class": "LlamaTokenizer". It worked for me~

@huangzhongzhong
Author

I have observed that a few seconds before the error occurred, the memory usage suddenly spiked to 60GB out of my total 64GB memory. I suspect this issue might be related to the memory consumption. Could you please provide some guidance or suggestions on how to handle this situation? Thank you in advance. @gch8295322
[screenshots: Task Manager showing memory usage]
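Note that the traceback earlier in this thread shows fastchat's apply_delta.py also has a low-memory code path (apply_delta_low_cpu_mem, gated on args.low_cpu_mem). If your FastChat version exposes that as a command-line flag (presumably --low-cpu-mem), it may keep peak RAM down by not materializing both models at once, e.g.:

python -m fastchat.model.apply_delta --base I:\chatgpt\minigpt4\MiniGPT-4\llama-13b-hf --target I:\chatgpt\minigpt4\MiniGPT-4\model --delta I:\chatgpt\minigpt4\MiniGPT-4\vicuna-13b-delta-v0 --low-cpu-mem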

@gch8295322

I have seen this issue here before; see if that helps you?

@huangzhongzhong
Author

Dear @gch8295322

Thank you for your help earlier. I have prepared the model, but I am still encountering the "TypeError: argument of type 'WindowsPath' is not iterable" issue. I noticed that this problem is also being discussed in #28. Is there a solution to this at the moment? Once again, thank you for your assistance.

Best regards

[screenshots: TypeError traceback in the Windows terminal]
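This TypeError typically means a pathlib.WindowsPath reached code that expects a plain string (for example, a substring check like `"llama" in model_name_or_path` raises exactly this error on a Path object). A commonly reported workaround, sketched here as an assumption since the full traceback isn't visible, is to convert the path with str() before passing it to from_pretrained:

```python
# Hypothetical workaround: hand transformers a string, not a WindowsPath.
from pathlib import Path
from transformers import AutoTokenizer

model_path = Path(r"I:\chatgpt\minigpt4\MiniGPT-4\model")
tokenizer = AutoTokenizer.from_pretrained(str(model_path), use_fast=False)
```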

@Wenbobobo

Wenbobobo commented Apr 20, 2023

Change "tokenizer_class": "LLaMATokenizer" in llama-13b-hf/tokenizer_config.json into "tokenizer_class": "LlamaTokenizer". It worked for me~

I still ran into this issue, and I checked that it is already "Llama"
(in the Windows terminal)
[screenshots: tokenizer_config.json contents and the error in the terminal]
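If the file already says "LlamaTokenizer" but the error persists, one thing worth ruling out is that a different tokenizer_config.json is being read than the one edited. A quick hypothetical check, again assuming the path from the command at the top of the thread:

```python
# Print the tokenizer_class in the file transformers will actually load.
import json
with open(r"I:\chatgpt\minigpt4\MiniGPT-4\llama-13b-hf\tokenizer_config.json") as f:
    print(json.load(f)["tokenizer_class"])
```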

@LikeGiver

LikeGiver commented Apr 22, 2023

@Wenbobobo
that's weird, I changed the class name and it worked. Maybe you should just restart the terminal?
[screenshot: successful run after renaming the tokenizer class]
