Attention shape error when fine-tuning internlm2_5_7b_chat on a custom dataset #996
02/27 03:06:07 - mmengine - INFO - before_train in EvaluateChatHook.
[rank0]: Traceback (most recent call last):
[rank0]: File "/home/xmyu/anaconda3/envs/xtuner-env/lib/python3.10/site-packages/xtuner/tools/train.py", line 360, in
[rank0]: main()
[rank0]: File "/home/xmyu/anaconda3/envs/xtuner-env/lib/python3.10/site-packages/xtuner/tools/train.py", line 356, in main
[rank0]: runner.train()
[rank0]: File "/home/xmyu/anaconda3/envs/xtuner-env/lib/python3.10/site-packages/mmengine/runner/_flexible_runner.py", line 1200, in train
[rank0]: model = self.train_loop.run() # type: ignore
[rank0]: File "/home/xmyu/anaconda3/envs/xtuner-env/lib/python3.10/site-packages/mmengine/runner/loops.py", line 273, in run
[rank0]: self.runner.call_hook('before_train')
[rank0]: File "/home/xmyu/anaconda3/envs/xtuner-env/lib/python3.10/site-packages/mmengine/runner/_flexible_runner.py", line 1271, in call_hook
[rank0]: getattr(hook, fn_name)(self, **kwargs)
[rank0]: File "/home/xmyu/anaconda3/envs/xtuner-env/lib/python3.10/site-packages/xtuner/engine/hooks/evaluate_chat_hook.py", line 234, in before_train
[rank0]: self._generate_samples(runner, max_new_tokens=50)
[rank0]: File "/home/xmyu/anaconda3/envs/xtuner-env/lib/python3.10/site-packages/xtuner/engine/hooks/evaluate_chat_hook.py", line 223, in _generate_samples
[rank0]: self._eval_language(runner, model, device, max_new_tokens,
[rank0]: File "/home/xmyu/anaconda3/envs/xtuner-env/lib/python3.10/site-packages/xtuner/engine/hooks/evaluate_chat_hook.py", line 181, in _eval_language
[rank0]: generation_output = model.generate(
[rank0]: File "/home/xmyu/anaconda3/envs/xtuner-env/lib/python3.10/site-packages/peft/peft_model.py", line 1491, in generate
[rank0]: outputs = self.base_model.generate(*args, **kwargs)
[rank0]: File "/home/xmyu/anaconda3/envs/xtuner-env/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
[rank0]: return func(*args, **kwargs)
[rank0]: File "/home/xmyu/anaconda3/envs/xtuner-env/lib/python3.10/site-packages/transformers/generation/utils.py", line 2223, in generate
[rank0]: result = self._sample(
[rank0]: File "/home/xmyu/anaconda3/envs/xtuner-env/lib/python3.10/site-packages/transformers/generation/utils.py", line 3214, in _sample
[rank0]: outputs = model_forward(**model_inputs, return_dict=True)
[rank0]: File "/home/xmyu/anaconda3/envs/xtuner-env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
[rank0]: return self._call_impl(*args, **kwargs)
[rank0]: File "/home/xmyu/anaconda3/envs/xtuner-env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
[rank0]: return forward_call(*args, **kwargs)
[rank0]: File "/home/xmyu/.cache/huggingface/modules/transformers_modules/internlm2_5-7b-chat/modeling_internlm2.py", line 1215, in forward
[rank0]: outputs = self.model(
[rank0]: File "/home/xmyu/anaconda3/envs/xtuner-env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
[rank0]: return self._call_impl(*args, **kwargs)
[rank0]: File "/home/xmyu/anaconda3/envs/xtuner-env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
[rank0]: return forward_call(*args, **kwargs)
[rank0]: File "/home/xmyu/.cache/huggingface/modules/transformers_modules/internlm2_5-7b-chat/modeling_internlm2.py", line 1010, in forward
[rank0]: layer_outputs = decoder_layer(
[rank0]: File "/home/xmyu/anaconda3/envs/xtuner-env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
[rank0]: return self._call_impl(*args, **kwargs)
[rank0]: File "/home/xmyu/anaconda3/envs/xtuner-env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
[rank0]: return forward_call(*args, **kwargs)
[rank0]: File "/home/xmyu/.cache/huggingface/modules/transformers_modules/internlm2_5-7b-chat/modeling_internlm2.py", line 744, in forward
[rank0]: hidden_states, self_attn_weights, present_key_value = self.attention(
[rank0]: File "/home/xmyu/anaconda3/envs/xtuner-env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
[rank0]: return self._call_impl(*args, **kwargs)
[rank0]: File "/home/xmyu/anaconda3/envs/xtuner-env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
[rank0]: return forward_call(*args, **kwargs)
[rank0]: File "/home/xmyu/.cache/huggingface/modules/transformers_modules/internlm2_5-7b-chat/modeling_internlm2.py", line 343, in forward
[rank0]: attn_weights = attn_weights + causal_mask
[rank0]: RuntimeError: The size of tensor a (41) must match the size of tensor b (40) at non-singleton dimension 3
[rank0]:[W227 03:06:08.848877297 ProcessGroupNCCL.cpp:1250] Warning: WARNING: process group has NOT been destroyed before we destruct ProcessGroupNCCL. On normal program exit, the application should call destroy_process_group to ensure that any pending NCCL operations have finished in this process. In rare cases this process can exit before this point and block the progress of another member of the process group. This constraint has always been present, but this warning has only been added since PyTorch 2.4 (function operator())
I printed the key tensors and found that after the first sampling round with batch_size=32, seq_length becomes 1 when training continues, which triggers the attention computation error above. Could this be a version mismatch? torch=2.5.1, transformers=4.49.0
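The failing line `attn_weights = attn_weights + causal_mask` requires the mask's last dimension to match the key/value length (cached tokens plus the new token). The sizes below are a minimal sketch reconstructed from the error message (41 vs. 40 at dimension 3), not taken from the actual model state:

```python
import torch

# Hypothetical shapes matching the traceback: during cached generation the
# query length is 1, attention scores cover 41 key/value positions, but the
# causal mask was built for only 40 positions.
attn_weights = torch.zeros(1, 32, 1, 41)  # (batch, num_heads, q_len, kv_len)
causal_mask = torch.zeros(1, 1, 1, 40)    # mask is one position too short

try:
    attn_weights + causal_mask  # broadcasting fails on the last dimension
except RuntimeError as e:
    print(e)  # size mismatch at non-singleton dimension 3, as in the traceback
```

This is consistent with a KV-cache/attention-mask length disagreement between the remote `modeling_internlm2.py` code and the newer `transformers` generation loop, so pinning `transformers` to a version the model's remote code was written against is a plausible first check.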