Notes on upgrading tensorflow-gpu to 1.5 on Ubuntu 16.04
While upgrading TensorFlow with pip, I found that TF had already moved to version 1.5. The upgrade steps are recorded below.

System information:

    Ubuntu 16.04 LTS x86_64
    Python 3.5.4 :: Anaconda custom (64-bit)

1. Upgrade TensorFlow with pip

Upgrading directly with pip install -U fails with an error about futures:

    Collecting futures>=3.1.1 (from tensorflow-tensorboard<1.6.0,>=1.5.0->tensorflow)
    Downloading https://pypi.tuna.tsinghua.edu.cn/packages/1f/9e/7b2ff7e965fc654592269f2906ade1c7d705f1bf25b7d469fa153f7d19eb/futures-3.2.0.tar.gz
    Unknown requires Python '>=2.6, <3' but the running Python is 3.5.4
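The rejection comes from the Python requirement that the package declares: futures is a backport of the standard-library concurrent.futures module for Python 2, and the 3.2.0 release declares '>=2.6, <3', which pip enforces. The sketch below only illustrates that comparison. The helper satisfies is hypothetical, it is not pip's real specifier logic, and it handles just the simple comma-separated clauses seen in the error message.

```python
import operator

# Comparison operators that may appear in a simple requirement string.
# Two-character operators come first so ">=" is not mistaken for ">".
_OPS = [(">=", operator.ge), ("<=", operator.le), ("==", operator.eq),
        (">", operator.gt), ("<", operator.lt)]


def parse_version(text):
    """Turn '3.5.4' into (3, 5, 4)."""
    return tuple(int(part) for part in text.strip().split("."))


def satisfies(running, requirement):
    """Return True if `running` (e.g. '3.5.4') meets every clause of
    `requirement` (e.g. '>=2.6, <3'). Hypothetical helper for illustration."""
    have = parse_version(running)
    for clause in requirement.split(","):
        clause = clause.strip()
        for symbol, compare in _OPS:
            if clause.startswith(symbol):
                want = parse_version(clause[len(symbol):])
                # Pad to the same length so (3,) compares like (3, 0, 0).
                width = max(len(have), len(want))
                padded_have = have + (0,) * (width - len(have))
                padded_want = want + (0,) * (width - len(want))
                if not compare(padded_have, padded_want):
                    return False
                break
        else:
            raise ValueError("unsupported clause: %r" % clause)
    return True


if __name__ == "__main__":
    # The interpreter from this post fails the "<3" clause.
    print(satisfies("3.5.4", ">=2.6, <3"))
    # A Python 2.7 interpreter would pass both clauses.
    print(satisfies("2.7.14", ">=2.6, <3"))
```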
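The fix is to install futures 3.1.1 first and then install tensorflow-gpu 1.5.0:

    pip install futures==3.1.1
    pip install tensorflow-gpu==1.5.0

After the upgrade, importing TensorFlow fails with a libcublas.so.9.0 error. Test code:

    import tensorflow as tf
    print(tf.__version__)

Output:

    ImportError: libcublas.so.9.0: cannot open shared object file: No such file or directory

The prebuilt tensorflow-gpu 1.5 binaries are built against CUDA 9 and cuDNN 7, so both have to be updated.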
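The ImportError means the dynamic loader cannot find the CUDA 9.0 cuBLAS library. One way to check this without importing TensorFlow is to ask the loader directly. This is a minimal sketch: can_load is a hypothetical helper name, and the list of library names is an assumption based on the error above plus the CUDA 9.0 runtime and cuDNN 7 that this post goes on to install.

```python
import ctypes


def can_load(library_name):
    """Return True if the dynamic loader can open `library_name`."""
    try:
        ctypes.CDLL(library_name)
        return True
    except OSError:
        # Raised when the shared object is not on the loader's search path.
        return False


if __name__ == "__main__":
    # Assumed library names; adjust them to the versions you expect.
    for name in ("libcublas.so.9.0", "libcudart.so.9.0", "libcudnn.so.7"):
        print(name, "found" if can_load(name) else "MISSING")
```

Running it again after the CUDA and cuDNN steps shows whether LD_LIBRARY_PATH now reaches the new libraries.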
2. Update CUDA 9 and cuDNN 7
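(1) Download the following two files:

    cuda-repo-ubuntu1604_9.0.176-1_amd64.deb
    7fa2af80.pub

(2) Install the repository package and add the public key. The dpkg command for this version is:

    sudo dpkg -i cuda-repo-ubuntu1604_9.0.176-1_amd64.deb

The NVIDIA documentation gives both commands in general form:

    sudo dpkg -i cuda-repo-<distro>_<version>_<architecture>.deb
    sudo apt-key add /var/cuda-repo-<version>/7fa2af80.pub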
(3) Configure a proxy for apt-get

The site http://developer.download.nvidia.com/ cannot be reached over IPv6, so set a proxy. Edit the apt configuration file:

    sudo vi /etc/apt/apt.conf

and add:

    Acquire::http::Proxy "http://127.0.0.1:8122";
    Acquire::https::Proxy "http://127.0.0.1:8122";
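If the proxy address changes often, the two lines can be generated instead of typed by hand. A small sketch, assuming one proxy URL serves both schemes as in this post; apt_proxy_lines is a hypothetical helper that only builds the text and does not touch /etc/apt/apt.conf.

```python
def apt_proxy_lines(proxy_url, schemes=("http", "https")):
    """Build one Acquire::<scheme>::Proxy line per scheme."""
    return ['Acquire::%s::Proxy "%s";' % (scheme, proxy_url)
            for scheme in schemes]


if __name__ == "__main__":
    # 127.0.0.1:8122 is the local proxy used in this post.
    print("\n".join(apt_proxy_lines("http://127.0.0.1:8122")))
```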
(4) Update the package index

    sudo apt-get update

(5) List the available versions

    sudo apt-cache policy cuda

General form:

    sudo apt-cache policy <package name>

Output:

    cuda:
      Installed: 8.0.61-1
      Candidate: 9.1.85-1
      Version table:
         9.1.85-1 500
            500 http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64 Packages
         9.0.176-1 500
            500 http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64 Packages
            100 /var/lib/dpkg/status
     *** 8.0.61-1 500
            500 http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64 Packages
         8.0.44-1 500
            500 http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64 Packages
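The candidate is 9.1, but TensorFlow 1.5 asks for libcublas.so.9.0, so the 9.0 entry has to be picked by hand. The sketch below pulls the version strings out of a policy listing like the one above and selects the first one with a given prefix. listed_versions and pick are hypothetical helpers, and SAMPLE is a shortened copy of the listing. Only the version lines are matched, so the language of the labels does not matter.

```python
import re

SAMPLE = """\
cuda:
  Installed: 8.0.61-1
  Candidate: 9.1.85-1
  Version table:
     9.1.85-1 500
        500 http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64 Packages
     9.0.176-1 500
        500 http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64 Packages
        100 /var/lib/dpkg/status
 *** 8.0.61-1 500
        500 http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64 Packages
     8.0.44-1 500
        500 http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64 Packages
"""

# A version line is an optional "***" marker, a version string, and a priority.
VERSION_LINE = re.compile(r"^\s*(?:\*\*\*\s+)?(\d[\w.+~:-]*)\s+\d+\s*$")


def listed_versions(policy_text):
    """Return the version strings in the order they are listed."""
    versions = []
    for line in policy_text.splitlines():
        match = VERSION_LINE.match(line)
        if match:
            versions.append(match.group(1))
    return versions


def pick(versions, prefix):
    """Return the first version that starts with `prefix`, or None."""
    for version in versions:
        if version.startswith(prefix):
            return version
    return None


if __name__ == "__main__":
    chosen = pick(listed_versions(SAMPLE), "9.0.")
    print("cuda=%s" % chosen)
```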
(6) Install the chosen version

    sudo apt-get install cuda=9.0.176-1
(7) Create the symlink and verify the installation

Create the symlink:

    cd /usr/local/
    sudo ln -s cuda-9.0 cuda

Set the environment variables (add them to ~/.bashrc):

    export PATH=/usr/local/cuda/bin:$PATH
    export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64"
    export CUDA_HOME=/usr/local/cuda

Run nvidia-smi. Output:

    Sun Feb 4 11:36:36 2018
    +-----------------------------------------------------------------------------+
    | NVIDIA-SMI 390.12                 Driver Version: 390.12                    |
    |-------------------------------+----------------------+----------------------+
    | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
    |===============================+======================+======================|
    |   0  GeForce GTX 108...  Off  | 00000000:03:00.0 Off |                  N/A |
    |  0%   19C    P5    26W / 250W |      0MiB / 11176MiB |      2%      Default |
    +-------------------------------+----------------------+----------------------+
    +-----------------------------------------------------------------------------+
    | Processes:                                                       GPU Memory |
    |  GPU       PID   Type   Process name                             Usage      |
    |=============================================================================|
    |  No running processes found                                                 |
    +-----------------------------------------------------------------------------+

Run nvcc -V. Output:

    nvcc: NVIDIA (R) Cuda compiler driver
    Copyright (c) 2005-2017 NVIDIA Corporation
    Built on Fri_Sep__1_21:08:03_CDT_2017
    Cuda compilation tools, release 9.0, V9.0.176
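The line that matters in the nvcc output is the last one, which should report release 9.0. The check can be scripted. This is a sketch: cuda_release is a hypothetical helper, and the regular expression assumes the "release X.Y, VX.Y.Z" wording shown above.

```python
import re
import shutil
import subprocess


def cuda_release(nvcc_output):
    """Extract e.g. ('9.0', '9.0.176') from the text printed by nvcc -V."""
    match = re.search(r"release\s+(\d+\.\d+),\s+V(\d+(?:\.\d+)*)", nvcc_output)
    return match.groups() if match else None


if __name__ == "__main__":
    if shutil.which("nvcc") is None:
        # Usually means /usr/local/cuda/bin is missing from PATH.
        print("nvcc is not on PATH; check the exports in ~/.bashrc")
    else:
        text = subprocess.check_output(["nvcc", "-V"]).decode()
        print(cuda_release(text))
```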
(8) Install cuDNN 7

    wget -c http://developer.download.nvidia.com/compute/machine-learning/cudnn/secure/v7.0.5/prod/9.0_20171129/cudnn-9.0-linux-x64-v7.tgz
    tar -zxvf cudnn-9.0-linux-x64-v7.tgz
    sudo cp cuda/include/cudnn.h /usr/local/cuda/include/
    sudo cp -d cuda/lib64/libcudnn* /usr/local/cuda/lib64/
    sudo chmod a+r /usr/local/cuda/include/cudnn.h
    sudo chmod a+r /usr/local/cuda/lib64/libcudnn*
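To confirm which cuDNN ended up under /usr/local/cuda, the version macros in the copied header can be read back. A sketch, assuming the header defines CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL as the cuDNN 7 cudnn.h does; cudnn_version is a hypothetical helper.

```python
import os
import re


def cudnn_version(header_text):
    """Return (major, minor, patch) from the text of cudnn.h, or None."""
    parts = []
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(r"^\s*#define\s+%s\s+(\d+)" % name,
                          header_text, re.MULTILINE)
        if match is None:
            return None
        parts.append(int(match.group(1)))
    return tuple(parts)


if __name__ == "__main__":
    # Path used by the cp command in this post.
    path = "/usr/local/cuda/include/cudnn.h"
    if os.path.exists(path):
        with open(path) as handle:
            print(cudnn_version(handle.read()))
    else:
        print("cudnn.h not found at", path)
```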
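(9) After the update, restore the original configuration

Remove the apt-get proxy:

    sudo mv /etc/apt/apt.conf /etc/apt/apt.conf.with_proxy

Disable the nvidia package source by commenting out the contents of cuda.list:

    sudo vi /etc/apt/sources.list.d/cuda.list

(10) For future updates, simply undo the changes made in step (9).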
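Switching the source off and on again is a matter of commenting and uncommenting lines. A sketch of that text transformation, with hypothetical helpers comment_out and uncomment. It assumes the file has no comment lines of its own, and the sample deb line is an assumption about what cuda.list typically contains. Writing the result back to /etc/apt/sources.list.d/cuda.list would still require root.

```python
def comment_out(text):
    """Prefix every non-empty, not yet commented line with '# '."""
    out = []
    for line in text.splitlines():
        if line.strip() and not line.lstrip().startswith("#"):
            line = "# " + line
        out.append(line)
    return "\n".join(out) + "\n"


def uncomment(text):
    """Remove a leading '# ' from every line that has one."""
    out = []
    for line in text.splitlines():
        if line.startswith("# "):
            line = line[2:]
        out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Assumed content of cuda.list, for illustration only.
    source = "deb http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64 /\n"
    disabled = comment_out(source)
    print(disabled, end="")
    print(uncomment(disabled), end="")
```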
References
[1] Failed install on Windows
[2] NVIDIA documentation
[3] Configure proxy for APT?
[4] How to install specific version of some package
[5] Deep learning server environment setup