Huggingface BERT TPU fine-tuning works on Colab but not on GCP

2024-04-14

I'm trying to fine-tune a Huggingface Transformers BERT model on a TPU. It works in Colab but fails when I switch to a paid TPU on GCP. The Jupyter notebook code is as follows:

import tensorflow as tf
from transformers import TFBertModel

# [1] Load the model outside any distribution strategy. This works.
model = TFBertModel.from_pretrained('bert-large-uncased-whole-word-masking-finetuned-squad')

# [2] Connect to the TPU and build the strategy. This also works: I got a
# bunch of startup messages from the TPU, all good.
cluster_resolver = tf.distribute.cluster_resolver.TPUClusterResolver(
    tpu='[My TPU]',
    zone='us-central1-a',
    project='[My Project]'
)
tf.config.experimental_connect_to_cluster(cluster_resolver)
tf.tpu.experimental.initialize_tpu_system(cluster_resolver)
tpu_strategy = tf.distribute.experimental.TPUStrategy(cluster_resolver)

# [3] Load the model inside the strategy scope. This generates the error
# below (long). The same line works in Colab.
with tpu_strategy.scope():
    model = TFBertModel.from_pretrained('bert-large-uncased-whole-word-masking-finetuned-squad')

Here is the error message:

NotFoundError                             Traceback (most recent call last)
<ipython-input-14-2cfc1a238903> in <module>
      1 with tpu_strategy.scope():
----> 2     model = TFBertModel.from_pretrained('bert-large-uncased-whole-word-masking-finetuned-squad')

~/.local/lib/python3.5/site-packages/transformers/modeling_tf_utils.py in from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
    309             return load_pytorch_checkpoint_in_tf2_model(model, resolved_archive_file, allow_missing_keys=True)
    310 
--> 311         ret = model(model.dummy_inputs, training=False)  # build the network with dummy inputs
    312 
    313         assert os.path.isfile(resolved_archive_file), "Error retrieving file {}".format(resolved_archive_file)

/usr/local/lib/python3.5/dist-packages/tensorflow_core/python/keras/engine/base_layer.py in __call__(self, inputs, *args, **kwargs)
    820           with base_layer_utils.autocast_context_manager(
    821               self._compute_dtype):
--> 822             outputs = self.call(cast_inputs, *args, **kwargs)
    823           self._handle_activity_regularization(inputs, outputs)
    824           self._set_mask_metadata(inputs, outputs, input_masks)

~/.local/lib/python3.5/site-packages/transformers/modeling_tf_bert.py in call(self, inputs, **kwargs)
    688 
    689     def call(self, inputs, **kwargs):
--> 690         outputs = self.bert(inputs, **kwargs)
    691         return outputs
    692 

/usr/local/lib/python3.5/dist-packages/tensorflow_core/python/keras/engine/base_layer.py in __call__(self, inputs, *args, **kwargs)
    820           with base_layer_utils.autocast_context_manager(
    821               self._compute_dtype):
--> 822             outputs = self.call(cast_inputs, *args, **kwargs)
    823           self._handle_activity_regularization(inputs, outputs)
    824           self._set_mask_metadata(inputs, outputs, input_masks)

~/.local/lib/python3.5/site-packages/transformers/modeling_tf_bert.py in call(self, inputs, attention_mask, token_type_ids, position_ids, head_mask, inputs_embeds, training)
    548 
    549         embedding_output = self.embeddings([input_ids, position_ids, token_type_ids, inputs_embeds], training=training)
--> 550         encoder_outputs = self.encoder([embedding_output, extended_attention_mask, head_mask], training=training)
    551 
    552         sequence_output = encoder_outputs[0]

/usr/local/lib/python3.5/dist-packages/tensorflow_core/python/keras/engine/base_layer.py in __call__(self, inputs, *args, **kwargs)
    820           with base_layer_utils.autocast_context_manager(
    821               self._compute_dtype):
--> 822             outputs = self.call(cast_inputs, *args, **kwargs)
    823           self._handle_activity_regularization(inputs, outputs)
    824           self._set_mask_metadata(inputs, outputs, input_masks)

~/.local/lib/python3.5/site-packages/transformers/modeling_tf_bert.py in call(self, inputs, training)
    365                 all_hidden_states = all_hidden_states + (hidden_states,)
    366 
--> 367             layer_outputs = layer_module([hidden_states, attention_mask, head_mask[i]], training=training)
    368             hidden_states = layer_outputs[0]
    369 

/usr/local/lib/python3.5/dist-packages/tensorflow_core/python/keras/engine/base_layer.py in __call__(self, inputs, *args, **kwargs)
    820           with base_layer_utils.autocast_context_manager(
    821               self._compute_dtype):
--> 822             outputs = self.call(cast_inputs, *args, **kwargs)
    823           self._handle_activity_regularization(inputs, outputs)
    824           self._set_mask_metadata(inputs, outputs, input_masks)

~/.local/lib/python3.5/site-packages/transformers/modeling_tf_bert.py in call(self, inputs, training)
    341         hidden_states, attention_mask, head_mask = inputs
    342 
--> 343         attention_outputs = self.attention([hidden_states, attention_mask, head_mask], training=training)
    344         attention_output = attention_outputs[0]
    345         intermediate_output = self.intermediate(attention_output)

/usr/local/lib/python3.5/dist-packages/tensorflow_core/python/keras/engine/base_layer.py in __call__(self, inputs, *args, **kwargs)
    820           with base_layer_utils.autocast_context_manager(
    821               self._compute_dtype):
--> 822             outputs = self.call(cast_inputs, *args, **kwargs)
    823           self._handle_activity_regularization(inputs, outputs)
    824           self._set_mask_metadata(inputs, outputs, input_masks)

~/.local/lib/python3.5/site-packages/transformers/modeling_tf_bert.py in call(self, inputs, training)
    290         input_tensor, attention_mask, head_mask = inputs
    291 
--> 292         self_outputs = self.self_attention([input_tensor, attention_mask, head_mask], training=training)
    293         attention_output = self.dense_output([self_outputs[0], input_tensor], training=training)
    294         outputs = (attention_output,) + self_outputs[1:]  # add attentions if we output them

/usr/local/lib/python3.5/dist-packages/tensorflow_core/python/keras/engine/base_layer.py in __call__(self, inputs, *args, **kwargs)
    820           with base_layer_utils.autocast_context_manager(
    821               self._compute_dtype):
--> 822             outputs = self.call(cast_inputs, *args, **kwargs)
    823           self._handle_activity_regularization(inputs, outputs)
    824           self._set_mask_metadata(inputs, outputs, input_masks)

~/.local/lib/python3.5/site-packages/transformers/modeling_tf_bert.py in call(self, inputs, training)
    222 
    223         batch_size = shape_list(hidden_states)[0]
--> 224         mixed_query_layer = self.query(hidden_states)
    225         mixed_key_layer = self.key(hidden_states)
    226         mixed_value_layer = self.value(hidden_states)

/usr/local/lib/python3.5/dist-packages/tensorflow_core/python/keras/engine/base_layer.py in __call__(self, inputs, *args, **kwargs)
    820           with base_layer_utils.autocast_context_manager(
    821               self._compute_dtype):
--> 822             outputs = self.call(cast_inputs, *args, **kwargs)
    823           self._handle_activity_regularization(inputs, outputs)
    824           self._set_mask_metadata(inputs, outputs, input_masks)

/usr/local/lib/python3.5/dist-packages/tensorflow_core/python/keras/layers/core.py in call(self, inputs)
   1142         outputs = gen_math_ops.mat_mul(inputs, self.kernel)
   1143     if self.use_bias:
-> 1144       outputs = nn.bias_add(outputs, self.bias)
   1145     if self.activation is not None:
   1146       return self.activation(outputs)  # pylint: disable=not-callable

/usr/local/lib/python3.5/dist-packages/tensorflow_core/python/ops/nn_ops.py in bias_add(value, bias, data_format, name)
   2756     else:
   2757       return gen_nn_ops.bias_add(
-> 2758           value, bias, data_format=data_format, name=name)
   2759 
   2760 

/usr/local/lib/python3.5/dist-packages/tensorflow_core/python/ops/gen_nn_ops.py in bias_add(value, bias, data_format, name)
    675       try:
    676         return bias_add_eager_fallback(
--> 677             value, bias, data_format=data_format, name=name, ctx=_ctx)
    678       except _core._SymbolicException:
    679         pass  # Add nodes to the TensorFlow graph.

/usr/local/lib/python3.5/dist-packages/tensorflow_core/python/ops/gen_nn_ops.py in bias_add_eager_fallback(value, bias, data_format, name, ctx)
    703     data_format = "NHWC"
    704   data_format = _execute.make_str(data_format, "data_format")
--> 705   _attr_T, _inputs_T = _execute.args_to_matching_eager([value, bias], ctx)
    706   (value, bias) = _inputs_T
    707   _inputs_flat = [value, bias]

/usr/local/lib/python3.5/dist-packages/tensorflow_core/python/eager/execute.py in args_to_matching_eager(l, ctx, default_dtype)
    265         dtype = ret[-1].dtype
    266   else:
--> 267     ret = [ops.convert_to_tensor(t, dtype, ctx=ctx) for t in l]
    268 
    269   # TODO(slebedev): consider removing this as it leaks a Keras concept.

/usr/local/lib/python3.5/dist-packages/tensorflow_core/python/eager/execute.py in <listcomp>(.0)
    265         dtype = ret[-1].dtype
    266   else:
--> 267     ret = [ops.convert_to_tensor(t, dtype, ctx=ctx) for t in l]
    268 
    269   # TODO(slebedev): consider removing this as it leaks a Keras concept.

/usr/local/lib/python3.5/dist-packages/tensorflow_core/python/framework/ops.py in convert_to_tensor(value, dtype, name, as_ref, preferred_dtype, dtype_hint, ctx, accepted_result_types)
   1312 
   1313     if ret is None:
-> 1314       ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
   1315 
   1316     if ret is NotImplemented:

/usr/local/lib/python3.5/dist-packages/tensorflow_core/python/distribute/values.py in _tensor_conversion_mirrored(var, dtype, name, as_ref)
   1174 # allowing instances of the class to be used as tensors.
   1175 def _tensor_conversion_mirrored(var, dtype=None, name=None, as_ref=False):
-> 1176   return var._dense_var_to_tensor(dtype=dtype, name=name, as_ref=as_ref)  # pylint: disable=protected-access
   1177 
   1178 

/usr/local/lib/python3.5/dist-packages/tensorflow_core/python/distribute/values.py in _dense_var_to_tensor(self, dtype, name, as_ref)
    908     if _enclosing_tpu_context() is None:
    909       return super(TPUVariableMixin, self)._dense_var_to_tensor(
--> 910           dtype=dtype, name=name, as_ref=as_ref)
    911     # pylint: enable=protected-access
    912     elif dtype is not None and dtype != self.dtype:

/usr/local/lib/python3.5/dist-packages/tensorflow_core/python/distribute/values.py in _dense_var_to_tensor(self, dtype, name, as_ref)
   1164     assert not as_ref
   1165     return ops.convert_to_tensor(
-> 1166         self.get(), dtype=dtype, name=name, as_ref=as_ref)
   1167 
   1168   def _clone_with_new_values(self, new_values):

/usr/local/lib/python3.5/dist-packages/tensorflow_core/python/distribute/values.py in get(self, device)
    835   def get(self, device=None):
    836     if (_enclosing_tpu_context() is None) or (device is not None):
--> 837       return super(TPUVariableMixin, self).get(device=device)
    838     else:
    839       raise NotImplementedError(

/usr/local/lib/python3.5/dist-packages/tensorflow_core/python/distribute/values.py in get(self, device)
    320         device = distribute_lib.get_update_device()
    321         if device is None:
--> 322           return self._get_cross_replica()
    323     device = device_util.canonicalize(device)
    324     return self._device_map.select_for_device(self._values, device)

/usr/local/lib/python3.5/dist-packages/tensorflow_core/python/distribute/values.py in _get_cross_replica(self)
   1136     replica_id = self._device_map.replica_for_device(device)
   1137     if replica_id is None:
-> 1138       return array_ops.identity(self.primary)
   1139     return array_ops.identity(self._values[replica_id])
   1140 

/usr/local/lib/python3.5/dist-packages/tensorflow_core/python/util/dispatch.py in wrapper(*args, **kwargs)
    178     """Call target, and fall back on dispatchers if there is a TypeError."""
    179     try:
--> 180       return target(*args, **kwargs)
    181     except (TypeError, ValueError):
    182       # Note: convert_to_eager_tensor currently raises a ValueError, not a

/usr/local/lib/python3.5/dist-packages/tensorflow_core/python/ops/array_ops.py in identity(input, name)
    265     # variables. Variables have correct handle data when graph building.
    266     input = ops.convert_to_tensor(input)
--> 267   ret = gen_array_ops.identity(input, name=name)
    268   # Propagate handle data for happier shape inference for resource variables.
    269   if hasattr(input, "_handle_data"):

/usr/local/lib/python3.5/dist-packages/tensorflow_core/python/ops/gen_array_ops.py in identity(input, name)
   3824         pass  # Add nodes to the TensorFlow graph.
   3825     except _core._NotOkStatusException as e:
-> 3826       _ops.raise_from_not_ok_status(e, name)
   3827   # Add nodes to the TensorFlow graph.
   3828   _, _, _op, _outputs = _op_def_library._apply_op_helper(

/usr/local/lib/python3.5/dist-packages/tensorflow_core/python/framework/ops.py in raise_from_not_ok_status(e, name)
   6604   message = e.message + (" name: " + name if name is not None else "")
   6605   # pylint: disable=protected-access
-> 6606   six.raise_from(core._status_to_exception(e.code, message), None)
   6607   # pylint: enable=protected-access
   6608 

/usr/local/lib/python3.5/dist-packages/six.py in raise_from(value, from_value)

NotFoundError: '_MklMatMul' is neither a type of a primitive operation nor a name of a function registered in binary running on n-aa2fcfb7-w-0. One possible root cause is the client and server binaries are not built with the same version. Please make sure the operation or function is registered in the binary running in this process. [Op:Identity]

I posted this on the Huggingface GitHub (https://github.com/huggingface/transformers/issues/2572), and they suggested that the TPU server version might not match the TPU client version, but (a) I don't know how to check that, and (b) I don't know what to do about it. Suggestions appreciated.
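
A minimal sketch of one way to approach (a), under stated assumptions: the client-side TensorFlow version is whatever `tf.__version__` reports in the notebook, and the TPU-side version is the software version selected when the TPU node was created (shown in the Cloud Console and, as far as I know, in the `tensorflowVersion` field printed by `gcloud compute tpus describe`). The helper names `major_minor` and `versions_match` and the `tpu_version` value are hypothetical and only for illustration; this check does not by itself confirm that a version mismatch causes the `_MklMatMul` error.

```python
import re


def major_minor(version):
    """Return (major, minor) parsed from a version string such as '2.1.0' or '2.1'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("unrecognized version string: {!r}".format(version))
    return int(match.group(1)), int(match.group(2))


def versions_match(client_version, tpu_version):
    """Compare only major.minor, since the TPU version is usually given as e.g. '2.1'."""
    return major_minor(client_version) == major_minor(tpu_version)


# Client side: the TensorFlow build installed on the notebook VM.
try:
    import tensorflow as tf
    client_version = tf.__version__
except ImportError:
    client_version = None  # TensorFlow is not installed in this environment

# TPU side: hypothetical placeholder. Replace it with the version reported
# for your TPU node in the Cloud Console or by `gcloud compute tpus describe`.
tpu_version = "2.1"

if client_version is None:
    print("TensorFlow is not installed; cannot read the client version.")
elif versions_match(client_version, tpu_version):
    print("Client {} matches TPU {}".format(client_version, tpu_version))
else:
    print("Mismatch: client {} vs TPU {}".format(client_version, tpu_version))
```

If the two versions differ, aligning them would be the first thing to try for (b): either recreate the TPU node with the client's TensorFlow version, or install the matching TensorFlow release on the VM that runs the notebook.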
