我将 TensorFlow 配置为在 GPU (GeForce 840M) 上支持 CUDA,但程序运行速度相当慢slow与我之前使用的 CPU 相比。还有,我do not收到任何类型的消息某某CUDA库打开成功当我运行程序时。相反,这是我运行任何张量流程序时在日志中得到的内容:
python Neuralnet.py
Successfully downloaded train-images-idx3-ubyte.gz 9912422 bytes.
Extracting /tmp/data/train-images-idx3-ubyte.gz
Successfully downloaded train-labels-idx1-ubyte.gz 28881 bytes.
Extracting /tmp/data/train-labels-idx1-ubyte.gz
Successfully downloaded t10k-images-idx3-ubyte.gz 1648877 bytes.
Extracting /tmp/data/t10k-images-idx3-ubyte.gz
Successfully downloaded t10k-labels-idx1-ubyte.gz 4542 bytes.
Extracting /tmp/data/t10k-labels-idx1-ubyte.gz
2017-03-28 07:53:57.979382: W tensorflow/core/platform/cpu_feature_guard.cc:45]
The TensorFlow library wasn't compiled to use SSE4.1 instructions,
but these are available on your machine and could speed up CPU computations.
2017-03-28 07:53:57.979413: W tensorflow/core/platform/cpu_feature_guard.cc:45]
The TensorFlow library wasn't compiled to use SSE4.2 instructions,
but these are available on your machine and could speed up CPU computations.
2017-03-28 07:53:57.979431: W tensorflow/core/platform/cpu_feature_guard.cc:45]
The TensorFlow library wasn't compiled to use AVX instructions,
but these are available on your machine and could speed up CPU computations.
2017-03-28 07:53:57.979438: W tensorflow/core/platform/cpu_feature_guard.cc:45]
The TensorFlow library wasn't compiled to use AVX2 instructions,
but these are available on your machine and could speed up CPU computations.
2017-03-28 07:53:57.979447: W tensorflow/core/platform/cpu_feature_guard.cc:45]
The TensorFlow library wasn't compiled to use FMA instructions,
but these are available on your machine and could speed up CPU computations.
2017-03-28 07:53:58.233876: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:901]
successful NUMA node read from SysFS had negative value (-1),
but there must be at least one NUMA node, so returning NUMA node zero
2017-03-28 07:53:58.234333: I tensorflow/core/common_runtime/gpu/gpu_device.cc:887]
Found device 0 with properties:
name: GeForce 840M
major: 5 minor: 0 memoryClockRate (GHz) 1.124
pciBusID 0000:08:00.0
Total memory: 1.96GiB
Free memory: 1.75GiB
2017-03-28 07:53:58.234362: I tensorflow/core/common_runtime/gpu/gpu_device.cc:908] DMA: 0
2017-03-28 07:53:58.234372: I tensorflow/core/common_runtime/gpu/gpu_device.cc:918] 0: Y
2017-03-28 07:53:58.234388: I tensorflow/core/common_runtime/gpu/gpu_device.cc:977]
Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce 840M, pci bus id: 0000:08:00.0)
('Epoch', 0, 'completed out of', 15, 'loss:', 115374329.04653475)
就这样,程序开始运行,但它并没有按照我的预期运行得更快。我从官方文档安装了 CUDA,但我没有重置 git master head,因为它会产生问题,并且我使用了提供的相同优化标志bazel build -c opt --config=cuda //tensorflow/tools/pip_package:build_pip_package
通过 bazel 构建时。
您是否使用 nvidia-smi 来判断您是否安装了正确的 cuda 驱动程序以及您的 GPU 对系统是否可见?
在 TF 中您可以设置日志设备放置 https://www.tensorflow.org/tutorials/using_gpu#logging_device_placement用于了解是否有任何操作被分配给 GPU 的选项。
本文内容由网友自发贡献,版权归原作者所有,本站不承担相应法律责任。如您发现有涉嫌抄袭侵权的内容,请联系:hwhale#tublm.com(使用前将#替换为@)