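With approaches as absurdly powerful as TensorFlow and CNNs now available, image detection and classification is far easier than it was with traditional methods (and takes less thought?). GPU acceleration plays a decisive role: in the environment used in this post, the FPS with the GPU was about 7 times the FPS with the CPU alone. On top of that, a NAS has plenty of storage, which makes it a good place to keep training data, and running the computation directly on it can save a good deal of time and money.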

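This post uses darkflow with a pre-trained YOLO model to do real-time image detection on a QNAP NAS TS-1685, running Ubuntu 16.04 in Linux Station with an NVIDIA GeForce GTX 1060. The result is shown below: it reads frames from a webcam and analyzes them. In the video, a person, a pair of scissors, and a remote control appear one after another, and then all three appear together.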

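Getting started

Entering the Ubuntu shell

I prefer to do everything in a terminal, without any graphical interface, but getting from the NAS into the Ubuntu terminal takes a little extra work. After connecting to the NAS over ssh, run the following command: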

[~] # lxc-attach -n ubuntu_1604 -P `getcfg ubuntu-hd Install_Path -f /etc/config/qpkg.conf`/lxc -- sudo -u admin -i

To run a command as administrator (user "root"), use "sudo <command>".

See "man sudo_root" for details.

admin@ubuntu_1604:~$

You should now see the Ubuntu 16.04 prompt, logged in as admin. From here on, $ stands in for the full prompt.

Updating the system

$ sudo apt update

$ sudo apt upgrade

Installing the Python environment

$ sudo apt install -y git
$ git clone https://github.com/pyenv/pyenv.git ~/.pyenv
$ echo 'export PYENV_ROOT="$HOME/.pyenv"' >> ~/.bashrc
$ echo 'export PATH="$PYENV_ROOT/bin:$PATH"' >> ~/.bashrc
$ echo 'eval "$(pyenv init -)"' >> ~/.bashrc

Re-enter the shell so that the changes to ~/.bashrc take effect.

$ git clone https://github.com/pyenv/pyenv-virtualenv.git $(pyenv root)/plugins/pyenv-virtualenv
$ echo 'eval "$(pyenv virtualenv-init -)"' >> ~/.bashrc

Re-enter the shell again.

Next, install Python 3.6.3 and make the darkflow virtualenv the default:

$ sudo apt-get install -y libbz2-dev libreadline-dev libssl-dev libsqlite3-dev

$ pyenv install 3.6.3
$ pyenv shell 3.6.3
$ pyenv virtualenv darkflow
$ pyenv global darkflow

After re-entering the shell, the prompt should look like this:

(darkflow) admin@ubuntu_1604:~$
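
The prompt only shows that the virtualenv is active. A quick way to double-check which interpreter is actually running is a small sketch like the one below. The function name describe_interpreter is made up for illustration. On this setup I would expect version 3.6.3 and paths under ~/.pyenv/versions/darkflow, but that is an expectation, not captured output.

```python
import platform
import sys


def describe_interpreter():
    """Return the version, executable path and prefix of the running Python."""
    return {
        "version": platform.python_version(),
        "executable": sys.executable,
        "prefix": sys.prefix,
    }


if __name__ == "__main__":
    # Print one "key: value" line per field for a quick visual check.
    for key, value in describe_interpreter().items():
        print(f"{key}: {value}")
```

Save it as a file and run it with python inside the (darkflow) shell.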

Installing darkflow

$ git clone https://github.com/thtrieu/darkflow
$ cd darkflow
$ pip install Cython numpy opencv-python tensorflow
$ pip install -e .

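Building OpenCV

With a stock Python there is no need to rebuild OpenCV. Here, however, Python comes from pyenv, and it fails to load cv2.so without giving any clear error message. Just do this step and don't ask too many questions.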

$ mkdir ~/opencv
$ curl -L https://github.com/opencv/opencv/archive/3.3.1.zip > ~/opencv/3.3.1.zip
$ cd ~/opencv/

$ unzip 3.3.1.zip
$ cd opencv*
$ mkdir build
$ cd build
$ sudo apt install -y cmake libgtk2.0-dev
$ export PREFIX_MAIN=`pyenv virtualenv-prefix` PREFIX=`pyenv prefix`

$ cmake -D CMAKE_BUILD_TYPE=RELEASE \
-D CMAKE_INSTALL_PREFIX="$PREFIX" \
-D PYTHON3_LIBRARY="$PREFIX_MAIN"/lib/libpython3.6m.a \
-D PYTHON3_INCLUDE_DIRS="$PREFIX_MAIN"/include/python3.6m \
-D PYTHON3_EXECUTABLE="$PREFIX"/bin/python3 \
-D PYTHON3_PACKAGES_PATH="$PREFIX"/lib/python3.6/site-packages/ \
-D PYTHON3_NUMPY_INCLUDE_DIRS="$PREFIX"/lib/python3.6/site-packages/numpy/core/include \
-D BUILD_opencv_python3=ON \
..
$ make -j"$(nproc)"
$ make install
$ cp lib/python3/cv2.cpython-*-linux-gnu.so $PREFIX/lib/python*/site-packages/cv2/
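
Since the whole point of this step is to get Python to load the right cv2, it may help to check where the interpreter would load it from. The sketch below only uses the standard library; module_origin is a name I made up. It reports the file a module resolves to without importing it, so it will not crash even if the .so is broken.

```python
import importlib.util


def module_origin(name):
    """Return the file a top-level module would be loaded from, or None if not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


if __name__ == "__main__":
    # None here means Python cannot find cv2 at all in this environment.
    print("cv2 ->", module_origin("cv2"))
```

If the path printed is not under the darkflow virtualenv's site-packages, the copy in the last command probably went to the wrong place.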

Testing with the CPU

Download the pre-trained weights

$ cd ~/darkflow

$ mkdir bin
$ curl https://pjreddie.com/media/files/yolo.weights > ~/darkflow/bin/yolo.weights

Run the test

$ flow --model cfg/yolo.cfg --load bin/yolo.weights --demo camera --saveVideo

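Building TensorFlow

TensorFlow is rebuilt here so that it supports the CPU instruction sets available on this NAS (SSE4) and the GPU. Before starting:

Make sure the NVIDIA driver is installed in QTS.

Enter the Ubuntu shell and install the same NVIDIA driver version as on the NAS: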

$ sudo add-apt-repository ppa:graphics-drivers/ppa
$ sudo apt update
$ sudo apt install -y nvidia-381

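Installing CUDA

Download the installer first, from this URL:

https://developer.nvidia.com/compute/cuda/8.0/Prod2/local_installers/cuda_8.0.61_375.26_linux-run

During installation, be sure to choose not to install the NVIDIA driver. The whole session looks like this: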

$ sudo sh cuda_*_linux.run

>>>
Do you accept the previously read EULA?
accept/decline/quit: accept

Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 375.26?
(y)es/(n)o/(q)uit: n

Install the CUDA 8.0 Toolkit?
(y)es/(n)o/(q)uit: y

Enter Toolkit Location
[ default is /usr/local/cuda-8.0 ]:

Do you want to install a symbolic link at /usr/local/cuda?
(y)es/(n)o/(q)uit: y

Install the CUDA 8.0 Samples?
(y)es/(n)o/(q)uit: y

Enter CUDA Samples Location
[ default is /home/admin ]:

Add the CUDA binaries to PATH

$ echo 'export PATH=/usr/local/cuda-8.0/bin:$PATH' >> ~/.bashrc

Installing cuDNN

Download cuDNN first (registration required): https://developer.nvidia.com/rdp/cudnn-download

For the detailed procedure, see http://docs.nvidia.com/deeplearning/sdk/cudnn-install/index.html#installlinux-tar

$ sudo apt install -y libcupti-dev openjdk-8-jdk

$ mkdir cudnn

$ tar xvf cudnn*.tgz -C cudnn/

$ cd cudnn

$ sudo cp cuda/include/cudnn.h /usr/local/cuda/include
$ sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64
$ sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn*

$ echo /usr/local/cuda-8.0/lib64/ | sudo tee /etc/ld.so.conf.d/cudnn.conf

$ sudo ldconfig
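
After ldconfig, the CUDA and cuDNN libraries should be visible to the dynamic linker. A rough way to check from Python is the standard ctypes.util.find_library call, as sketched below. The helper name find_shared_libs is hypothetical, and the names "cudart" and "cudnn" assume the usual libcudart / libcudnn file names.

```python
import ctypes.util


def find_shared_libs(names):
    """Map each library name to what the linker search finds (None if missing)."""
    return {name: ctypes.util.find_library(name) for name in names}


if __name__ == "__main__":
    # Library names are assumptions: libcudart (CUDA runtime) and libcudnn.
    for name, found in find_shared_libs(["cudart", "cudnn"]).items():
        print(name, "->", found or "NOT FOUND")
```

A "NOT FOUND" here would suggest the /etc/ld.so.conf.d/cudnn.conf entry or the copy into /usr/local/cuda/lib64 did not take effect.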

Installing bazel

$ echo "deb [arch=amd64] http://storage.googleapis.com/bazel-apt stable jdk1.8" | sudo tee /etc/apt/sources.list.d/bazel.list
$ curl https://bazel.build/bazel-release.pub.gpg | sudo apt-key add -
$ sudo apt-get update && sudo apt-get install bazel

Downloading TensorFlow

$ git clone https://github.com/tensorflow/tensorflow

$ cd tensorflow

$ git checkout -b r1.4 origin/r1.4

Configuring the TensorFlow build

There are many Y/N questions to answer. Pay particular attention to CUDA: answer Y, since the default is N.

$ ./configure

Extracting Bazel installation...
You have bazel 0.7.0 installed.
Please specify the location of python. [Default is /home/admin/.pyenv/versions/darkflow/bin/python]:

Found possible Python library paths:
/home/admin/.pyenv/versions/darkflow/lib/python3.6/site-packages
Please input the desired Python library path to use. Default is [/home/admin/.pyenv/versions/darkflow/lib/python3.6/site-packages]

Do you wish to build TensorFlow with jemalloc as malloc support? [Y/n]:
jemalloc as malloc support will be enabled for TensorFlow.

Do you wish to build TensorFlow with Google Cloud Platform support? [Y/n]:
Google Cloud Platform support will be enabled for TensorFlow.

Do you wish to build TensorFlow with Hadoop File System support? [Y/n]:
Hadoop File System support will be enabled for TensorFlow.

Do you wish to build TensorFlow with Amazon S3 File System support? [Y/n]: n
No Amazon S3 File System support will be enabled for TensorFlow.

Do you wish to build TensorFlow with XLA JIT support? [y/N]: n
No XLA JIT support will be enabled for TensorFlow.

Do you wish to build TensorFlow with GDR support? [y/N]: n
No GDR support will be enabled for TensorFlow.

Do you wish to build TensorFlow with VERBS support? [y/N]: n
No VERBS support will be enabled for TensorFlow.

Do you wish to build TensorFlow with OpenCL SYCL support? [y/N]: n
No OpenCL SYCL support will be enabled for TensorFlow.

Do you wish to build TensorFlow with CUDA support? [y/N]: y
CUDA support will be enabled for TensorFlow.

Please specify the CUDA SDK version you want to use, e.g. 7.0. [Leave empty to default to CUDA 8.0]:

Please specify the location where CUDA 8.0 toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:

Please specify the cuDNN version you want to use. [Leave empty to default to cuDNN 6.0]: 7.0.4

Please specify the location where cuDNN 7.0.4 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:

Please specify a list of comma-separated Cuda compute capabilities you want to build with.
You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
Please note that each additional compute capability significantly increases your build time and binary size. [Default is: 3.5,5.2]

Do you want to use clang as CUDA compiler? [y/N]:
nvcc will be used as CUDA compiler.

Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/bin/gcc]:

Do you wish to build TensorFlow with MPI support? [y/N]:
No MPI support will be enabled for TensorFlow.

Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native]:

Add "--config=mkl" to your bazel command to build with MKL support.
Please note that MKL on MacOS or windows is still not supported.
If you would like to use a local MKL instead of downloading, please set the environment variable "TF_MKL_ROOT" every time before build.
Configuration finished

Building TensorFlow

This takes a long time...

$ bazel build --config=cuda -c opt --copt=-mavx --copt=-mavx2 --copt=-mfma --copt=-mfpmath=both --copt=-msse4.2 //tensorflow/tools/pip_package:build_pip_package
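
The --copt options above tell the compiler to emit AVX, AVX2, FMA and SSE4.2 instructions, so the CPU has to support all of them or the resulting binary will die with an illegal instruction. Since the build takes so long, it can be worth checking /proc/cpuinfo first. Below is a minimal sketch. The function names and the REQUIRED table are mine, and the flag spellings (avx, avx2, fma, sse4_2) are the ones Linux uses on x86.

```python
def parse_cpu_flags(cpuinfo_text):
    """Return the set of feature flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


# --copt option -> flag name as it appears in /proc/cpuinfo (x86 Linux).
REQUIRED = {"-mavx": "avx", "-mavx2": "avx2", "-mfma": "fma", "-msse4.2": "sse4_2"}


def unsupported_copts(cpuinfo_text, required=REQUIRED):
    """Return the --copt options whose CPU flag is absent, sorted by name."""
    flags = parse_cpu_flags(cpuinfo_text)
    return sorted(opt for opt, flag in required.items() if flag not in flags)


if __name__ == "__main__":
    with open("/proc/cpuinfo") as f:
        missing = unsupported_copts(f.read())
    print("unsupported:", missing or "none")
```

Any option it lists should be dropped from the bazel command line before building.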

Building and installing the Python module

$ ./bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg

$ pip uninstall tensorflow

$ pip install /tmp/tensorflow_pkg/tensorflow-1.4.*-linux_x86_64.whl

Testing with the GPU

$ flow --model cfg/yolo.cfg --load bin/yolo.weights --demo camera --saveVideo --gpu 0.5 --threshold 0.5

See the video at the top of this post.

Note: on the first test, CUDA init failed. A closer look showed that /dev/nvidia-uvm was missing inside the container even though it existed on the host. Restarting Ubuntu fixed it.
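
To catch this kind of problem before launching flow, something like the sketch below can be run inside the container. The helper name missing_nodes is made up, and the default list of device nodes is my assumption about what the NVIDIA driver normally exposes for a single GPU. Only /dev/nvidia-uvm is confirmed by the failure described above.

```python
import os

# Assumed device nodes for a single-GPU system; only /dev/nvidia-uvm
# is known from the failure described in this post.
DEFAULT_NODES = ["/dev/nvidia0", "/dev/nvidiactl", "/dev/nvidia-uvm"]


def missing_nodes(paths=DEFAULT_NODES):
    """Return the paths from the list that do not exist."""
    return [p for p in paths if not os.path.exists(p)]


if __name__ == "__main__":
    missing = missing_nodes()
    if missing:
        print("missing device nodes:", ", ".join(missing))
    else:
        print("all expected device nodes are present")
```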

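If you want to experiment, try adjusting the gpu and threshold parameters above. If the GPU runs out of memory (OOM), lower the gpu parameter or restart Ubuntu; if detection is too inaccurate, adjust threshold. There are other parameters to tune as well; see the official site.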
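
To give a feel for what threshold does, here is a toy sketch of confidence filtering. It is not darkflow's code: the function name, the sample detections, and the choice of >= for the comparison are all made up for illustration, with the dict keys loosely following the label/confidence fields that darkflow's README shows in its prediction output.

```python
def filter_by_confidence(detections, threshold):
    """Keep only detections whose confidence is at least the threshold."""
    return [d for d in detections if d["confidence"] >= threshold]


# Made-up detections for illustration, not real model output.
sample = [
    {"label": "person", "confidence": 0.91},
    {"label": "scissors", "confidence": 0.55},
    {"label": "remote", "confidence": 0.32},
]

if __name__ == "__main__":
    for t in (0.3, 0.5, 0.9):
        kept = [d["label"] for d in filter_by_confidence(sample, t)]
        print(t, kept)
```

A higher threshold drops the shaky detections at the cost of missing some real objects; a lower one does the opposite.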

Afterword

I always end up writing these things up in the middle of the night. So, so tired...

Doro One Two Three

2017/12/20

Real-time image detection and classification with TensorFlow on NAS - Ubuntu 16.04
