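I am training the latest version of the LayoutLMv3 model, but trainer.train() fails with the following error as soon as training starts.
Please help me resolve it. I am using 4 V100 GPUs: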
RuntimeError Traceback (most recent call last)
/tmp/ipykernel_3844/4032920361.py in <module>
----> 1 trainer.train()
/data/anaconda3/envs/data/lib/python3.7/site-packages/transformers/trainer.py in train(self, resume_from_checkpoint, trial, ignore_keys_for_eval, **kwargs)
1417 resume_from_checkpoint=resume_from_checkpoint,
1418 trial=trial,
-> 1419 ignore_keys_for_eval=ignore_keys_for_eval,
1420 )
1421
/data/anaconda3/envs/data/lib/python3.7/site-packages/transformers/trainer.py in _inner_training_loop(self, batch_size, args, resume_from_checkpoint, trial, ignore_keys_for_eval)
1655 tr_loss_step = self.training_step(model, inputs)
1656 else:
-> 1657 tr_loss_step = self.training_step(model, inputs)
1658
1659 if (
/data/anaconda3/envs/data/lib/python3.7/site-packages/transformers/trainer.py in training_step(self, model, inputs)
2348
2349 with self.compute_loss_context_manager():
-> 2350 loss = self.compute_loss(model, inputs)
2351
2352 if self.args.n_gpu > 1:
...
visual_bbox = visual_bbox.to(device).type(dtype)
RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
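Because CUDA reports errors asynchronously, the `visual_bbox` line at the bottom of the traceback may not be where the failure actually occurs. A device-side assert during training is often caused by an index that is out of range for an embedding table or for the loss, for example a label id that is not smaller than the number of labels, a token id outside the vocabulary, or a bounding-box coordinate outside the 0-1000 range that LayoutLM-style models normally expect. The following is only a minimal sketch for narrowing this down, not a confirmed diagnosis of this particular run: `find_out_of_range` is a hypothetical helper, and the field names `input_ids`, `bbox`, `labels`, the 0-1000 coordinate limit, and the `-100` ignore index are assumptions about how the dataset was encoded.

```python
import os

# Make CUDA kernel launches synchronous so the traceback points at the real
# failing call. This must be set before the first CUDA call in the process
# (in a notebook: restart the kernel and run this cell first).
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"


def find_out_of_range(example, num_labels, vocab_size,
                      max_coord=1000, ignore_index=-100):
    """Return a list of problems found in one encoded example.

    `example` is assumed to be a dict of plain Python lists with the keys
    "input_ids", "bbox" and "labels" (hypothetical layout; adapt as needed).
    """
    problems = []

    # Token ids must index into the word embedding table.
    for i, tok in enumerate(example.get("input_ids", [])):
        if not 0 <= tok < vocab_size:
            problems.append(f"input_ids[{i}]={tok} outside [0, {vocab_size})")

    # Each box is assumed to be [x0, y0, x1, y1], normalized to 0..max_coord.
    for i, box in enumerate(example.get("bbox", [])):
        if len(box) != 4 or any(not 0 <= c <= max_coord for c in box):
            problems.append(f"bbox[{i}]={box} outside [0, {max_coord}]")

    # Labels must be valid class ids, except for the ignored positions.
    for i, lab in enumerate(example.get("labels", [])):
        if lab != ignore_index and not 0 <= lab < num_labels:
            problems.append(f"labels[{i}]={lab} outside [0, {num_labels})")

    return problems
```

If the encoded examples hold tensors rather than lists, converting each field with `.tolist()` before calling the helper should work. Running one batch on the CPU instead of the GPU is another way to turn the opaque assert into an ordinary Python exception that names the offending index.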