The first error was that none of the expected model weight files could be found:
OSError: Error no file named pytorch_model.bin, tf_model.h5, model.ckpt.index or flax_model.msgpack found in directory
![](https://img-blog.csdnimg.cn/cddf1f5d5dc94af2840da0a65d241d6c.png)
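A quick way to see which of the weight files named in that OSError are actually present is to look in the model directory. This is only a minimal sketch: `find_weight_files` is a hypothetical helper, and the file names are copied from the error message above.

```python
from pathlib import Path

# File names taken from the OSError message above.
WEIGHT_FILES = (
    "pytorch_model.bin",
    "tf_model.h5",
    "model.ckpt.index",
    "flax_model.msgpack",
)


def find_weight_files(model_dir):
    """Return the names of the expected weight files found in model_dir."""
    root = Path(model_dir)
    return [name for name in WEIGHT_FILES if (root / name).is_file()]
```

An empty list for your model directory means the weights were never downloaded into it, which would match the error.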
After searching for the model on https://huggingface.co/ and downloading it, PyTorch hit a weight shape mismatch when loading the pretrained model:
raise RuntimeError(f"Error(s) in loading state_dict for {model.__class__.__name__}:\n\t{error_msg}")
RuntimeError: Error(s) in loading state_dict for PegasusForConditionalGeneration:
size mismatch for final_logits_bias: copying a param with shape torch.Size([1, 96103]) from checkpoint, the shape in current model is torch.Size([1, 21128]).
size mismatch for model.shared.weight: copying a param with shape torch.Size([96103, 1024]) from checkpoint, the shape in current model is torch.Size([21128, 768]).
size mismatch for model.encoder.embed_tokens.weight: copying a param with shape torch.Size([96103, 1024]) from checkpoint, the shape in current model is torch.Size([21128, 768]).
size mismatch for model.encoder.embed_positions.weight: copying a param with shape torch.Size([512, 1024]) from checkpoint, the shape in current model is torch.Size([1024, 768]).
size mismatch for model.encoder.layers.0.self_attn.k_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([768, 768]).
![](https://img-blog.csdnimg.cn/33f44681277e447996cb8ef5c109c690.png)
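The mismatch lines are easier to read once the two shapes are pulled out side by side. Here is a rough sketch that parses the error text with a regular expression; `parse_mismatches` is a hypothetical helper, and the pattern assumes the exact message wording shown above.

```python
import re

# Matches one "size mismatch" line of the RuntimeError shown above.
MISMATCH_RE = re.compile(
    r"size mismatch for (?P<name>\S+): copying a param with shape "
    r"torch\.Size\(\[(?P<ckpt>[\d, ]*)\]\) from checkpoint, "
    r"the shape in current model is torch\.Size\(\[(?P<model>[\d, ]*)\]\)"
)


def parse_shape(text):
    """Turn '96103, 1024' into (96103, 1024)."""
    return tuple(int(x) for x in text.split(",") if x.strip())


def parse_mismatches(error_text):
    """Return (parameter name, checkpoint shape, model shape) triples."""
    return [
        (m["name"], parse_shape(m["ckpt"]), parse_shape(m["model"]))
        for m in MISMATCH_RE.finditer(error_text)
    ]


# One line copied from the error above.
sample = (
    "size mismatch for model.shared.weight: copying a param with shape "
    "torch.Size([96103, 1024]) from checkpoint, the shape in current model "
    "is torch.Size([21128, 768])."
)
for name, ckpt_shape, model_shape in parse_mismatches(sample):
    print(name, "checkpoint:", ckpt_shape, "model:", model_shape)
```

In the log above, the checkpoint has a hidden size of 1024 while the current model uses 768, and the vocabulary sizes differ too (96103 vs 21128), so the checkpoint and the model being built are not the same architecture size.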
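A Baidu search turned up two main causes:
1. The code is now running on CPU but loads a pkl that was trained on GPU
2. A problem in the code
I ruled out the code first, then checked the GPU status: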
import torch
print(torch.cuda.is_available())
![](https://img-blog.csdnimg.cn/4626ee310d804af0bc23fb459aedc499.png)
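For the CPU/GPU cause, the usual remedy is to pass `map_location` to `torch.load` so tensors saved on a GPU are remapped to whatever device is available. The following is a minimal sketch under that assumption; `pick_map_location` and `load_checkpoint` are hypothetical helper names, and the checkpoint path is whatever file you are loading.

```python
def pick_map_location(cuda_available):
    """Return the device string to pass as map_location."""
    return "cuda" if cuda_available else "cpu"


def load_checkpoint(path):
    """Load a checkpoint onto the GPU if there is one, otherwise onto the CPU."""
    # Imported lazily so pick_map_location can be used without PyTorch installed.
    import torch

    device = pick_map_location(torch.cuda.is_available())
    # map_location remaps tensors that were saved on a GPU onto `device`.
    return torch.load(path, map_location=device)
```

This only addresses device placement. It would not change the tensor shapes reported in a size mismatch.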
With both causes ruled out, I tried deleting the cached .pkl file and regenerating it:
![](https://img-blog.csdnimg.cn/30992a477db440acb602d47bd1d4f604.png)
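Clearing the cache can also be scripted. This is a small sketch that assumes the cached files sit directly inside one directory and end in `.pkl`; `remove_cached_pkl` is a hypothetical helper, and the directory to pass in depends on your project.

```python
from pathlib import Path


def remove_cached_pkl(cache_dir):
    """Delete every .pkl file directly inside cache_dir and return their names."""
    removed = []
    for path in sorted(Path(cache_dir).glob("*.pkl")):
        path.unlink()
        removed.append(path.name)
    return removed
```

The returned list shows exactly what was deleted, so you can confirm the stale cache is gone before regenerating it.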
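Still the same error.
So I asked a junior classmate.
Possible cause:
that arg. value: a base model may have been used to initialize a large model, so the parameter matrices do not line up
Fix 1: change the parameters in the config
I checked, and the config was fine
Fix 2: the PyTorch model file is too large
I had downloaded the large model, but the model was initialized as a base model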
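One way to do this config check is to compare the config values with the checkpoint shapes from the error message. This is a sketch, assuming the config uses the `vocab_size` and `d_model` keys as Pegasus configs in transformers do; `config_vs_checkpoint` is a hypothetical helper, and the numbers in the example come from the `model.shared.weight` line of the error above.

```python
def config_vs_checkpoint(config, embedding_shape):
    """Return config fields that disagree with the checkpoint embedding shape.

    embedding_shape is (vocab_size, hidden_size), e.g. the checkpoint shape
    reported for model.shared.weight. Each value in the result is a
    (value in config, value in checkpoint) pair.
    """
    vocab_size, hidden_size = embedding_shape
    expected = {"vocab_size": vocab_size, "d_model": hidden_size}
    return {
        key: (config.get(key), want)
        for key, want in expected.items()
        if config.get(key) != want
    }


# Values from the error above: the current model vs the checkpoint.
current_config = {"vocab_size": 21128, "d_model": 768}
print(config_vs_checkpoint(current_config, (96103, 1024)))
```

An empty result means the config matches the checkpoint. Any entries that remain are the fields where the model being built and the downloaded weights disagree.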
![](https://img-blog.csdnimg.cn/12bb3367aca245be8146970c7faead9a.png)
I re-downloaded the smaller model file.
Solved, haha!