Building a TensorFlow development environment with Docker

Time:2019-12-10

Please reprint only with my consent. Original address: https://zhangmenghuan.github.i

Preface

The first time I heard the word Docker was two years ago, when I was looking for an internship. I did a front-end phone interview with DaoCloud and learned that the company focused on container technology. I had no concept of containers back then, but Docker struck me as a very advanced technology. In the end I went with mobile development, which I was relatively better at. Two years have flown by, and today artificial intelligence and cloud computing are not only where the wind is blowing but also the infrastructure underneath many other technologies.

This year I plan to learn some fundamentals in the field of artificial intelligence, the kind of knowledge that will not be made obsolete by the market in a few years. To start, I am learning Google's TensorFlow framework. As the saying goes, "to do a good job, one must first sharpen one's tools", so the first step is to set up a good development environment. Currently, TensorFlow can be installed in the following ways:

  • Virtualenv
  • Pip
  • Docker
  • Anaconda
  • Install from source

Since I may dig deeper into Docker in the future, I chose Docker to install TensorFlow.

Docker Getting Started Guide

What is Docker?

One of the biggest headaches in software development is environment configuration. Every user's machine is different, so how do you know your software will run on theirs? Docker is a wrapper around Linux containers that provides a simple, easy-to-use interface for working with them; it is the most popular Linux container solution. Docker packages an application together with its dependencies into a single file. Running that file spins up a virtual container, and the program runs inside this container as if it were running on a real physical machine. With Docker, you no longer have to worry about the environment. Overall, Docker's interface is quite simple: users can easily create and use containers and put their own applications into them. Containers can also be versioned, copied, shared, and modified, just like ordinary code.

Docker runs on top of the operating system through the Docker engine, whereas a virtual machine virtualizes the hardware resources themselves.

Figuratively speaking, Docker is a "dock worker" that packs the goods we need (applications) into containers (images) with standard specifications. During deployment, the repetitive parts such as installation and configuration are automated: only the first deployment has to build the usable Docker image (the loaded container). After that, a few commands are enough to pull the image, create a container from it, and run the service. All you need is a server with Docker installed, a Dockerfile (the packing list), and a reasonably smooth network connection. It really is "build once, deploy everywhere."

There are three main uses of docker.

  • Provide a one-time environment. For example, testing other people’s software locally, and providing unit testing and build environments for continuous integration.
  • Provide flexible cloud services. Because Docker containers can be started and stopped at any time, they are well suited to dynamic scaling up and down.
  • Build a microservice architecture. Through multiple containers, a machine can run multiple services, so the microservice architecture can be simulated locally.

Docker general architecture

[Figure: Docker architecture diagram]

Docker uses a client/server (C/S) architecture. The Docker client talks to the Docker daemon through a REST API to manage Docker's images and containers. The server side runs in the background and is called the Docker daemon; the client is a CLI program, the docker binary you interact with on the command line.
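You can see this split from the CLI itself: the docker version command prints a separate section for the client and one for the daemon (the exact output depends on your installation):

# Prints a "Client" block (the CLI) and a "Server" block (the daemon)
$ docker version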

Install Docker

Docker is an open source commercial product with two versions: Community Edition (CE for short) and Enterprise Edition (EE for short).


  • Official document: https://docs.docker.com/
  • Installing Docker CE on macOS: https://docs.docker.com/docker-for-mac/install/

After the installation is complete, run the following commands to verify that the installation was successful.

$ docker --version
Docker version 18.03.0-ce, build 0520e24

$ docker-compose --version
docker-compose version 1.20.1, build 5d8c71b

$ docker-machine --version
docker-machine version 0.14.0, build 89b8332

Docker Registry

Remote Docker image registries:

  • DockerHub:https://hub.docker.com/
  • DaoCloud:https://hub.daocloud.io/
  • Aliyun:https://dev.aliyun.com/search…

When setting up the environment in China, many of the resources we need cannot be downloaded smoothly because of the Great Firewall. The solution is to switch to mirror sources provided by domestic vendors.
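For example, with Docker for Mac you can usually add a mirror under Preferences → Daemon, which corresponds to a registry-mirrors entry in the daemon configuration (daemon.json). The mirror URL below is only a placeholder; replace it with the address your provider gives you.

{
  "registry-mirrors": ["https://your-mirror.example.com"]
}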

Docker Image

A Docker image is a read-only template used to create Docker containers. It contains all of the configuration and programs a container needs in order to start, and once built it can be reused many times.

Containers can only be created from an image file: Docker generates container instances from images, and the same image can back multiple containers running at the same time. In practice, the image we create usually depends on the image of a Linux operating system such as Ubuntu; in most cases we call this the base image. If you check the Dockerfile of the Ubuntu image, you will find that it in turn depends on an image called scratch, which is Docker's empty image. If we want our own image to be as lightweight as possible, we can also build directly on top of scratch as our base image.
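As an illustration of building on scratch, a minimal Dockerfile might look like the following sketch. The hello binary is hypothetical; because scratch contains nothing at all, it would have to be a statically linked executable.

# Start from Docker's empty image
FROM scratch
# Copy a statically linked binary into the image root (hypothetical name)
ADD hello /
# Run it when the container starts
CMD ["/hello"]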

Image related commands:

# Pull an image from the remote registry to the local machine
$ docker image pull hello-world

# List all the images on this machine
$ docker image ls

REPOSITORY          TAG            IMAGE ID            CREATED             SIZE
nginx               latest         ae513a47849c        3 weeks ago         109MB
hello-world         latest         e38bc07ac18e        6 weeks ago         1.85kB

# Delete an image
$ docker image rm [imageName]
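
Images are addressed as repository:tag, and the tag defaults to latest. For example, to pull a specific Ubuntu release rather than the latest one:

# Pull the image tagged 18.04 from the ubuntu repository
$ docker image pull ubuntu:18.04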

Where are Docker images stored on macOS?

If you are using Docker for Mac, all Docker images are stored inside the following file:

/Users/{YourUserName}/Library/Containers/com.docker.docker/Data/com.docker.driver.amd64-linux/Docker.qcow2

Docker Container

A Docker container contains our application code together with its execution environment, and is used to package and distribute the application. The container instance generated from an image file is itself also a file, called the container file.

# List the containers running on this machine (docker container ls or docker ps)
$ docker container ls
CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS                NAMES
b15f63d6e87a        nginx               "nginx -g 'daemon of…"   About an hour ago   Up About an hour    0.0.0.0:80->80/tcp   webserver

# List all containers on this machine, including stopped ones (add --all, or -a for short)
$ docker container ls -a

# Stop a running container on this machine
$ docker container stop webserver

# Delete a container
$ docker rm container_name/container_id

# Start a container
$ docker start container_name/container_id

# Terminate a container
$ docker stop container_name/container_id

# Run /bin/bash inside the container; after this command you can operate on the container interactively from the command line. /bin/bash can be replaced by any executable command
$ docker exec -it container_name /bin/bash

To create a container and execute application code in it, use the run command:

docker run [OPTIONS] IMAGE [COMMAND] [ARG...]

For example, to create a container running nginx, you can execute:

docker run -d -p 80:80 --name webserver nginx

In a web browser, go to http://localhost/ to view the nginx home page. Because we mapped the default HTTP port, we do not need to append :80 to the end of the URL.

[Screenshot: nginx welcome page]

Meaning of parameters:

  • -d runs the container in the background as a daemon
  • --name sets the name of the container, which you can choose freely
  • -v maps a shared directory between the host and the container; the container-side directory is typically the one declared by the VOLUME instruction in the Dockerfile
  • -p maps a host port to a container port; the container port is the one declared by the EXPOSE instruction in the Dockerfile (a combined example follows below)
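
As a sketch of how these flags combine, suppose you want to serve static files from a directory on the host. The host path is purely illustrative, and /usr/share/nginx/html is the default site root of the official nginx image.

# Run nginx in the background, name the container, publish port 80,
# and mount a host directory over the default site root
$ docker run -d --name static-web -p 80:80 \
    -v /path/on/host/html:/usr/share/nginx/html \
    nginx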

Dockerfile

A Dockerfile is the file of instructions that explains how to build a Docker image automatically. Once the instructions are written in the Dockerfile, we can build the image with the docker build command. The order of the instructions in the Dockerfile is the order in which they are executed during the build.

Dockerfile reference: https://docs.docker.com/engin…

Here are some common instructions:

FROM: the base image. Every Dockerfile must start with a FROM instruction, indicating the image it builds on.

FROM image_name

RUN: a command executed during the build, in shell or exec form.

RUN <command>

ADD: copy files from the host into the image.

ADD /path/to/sourcefile/in/host /path/to/targetfile/in/container

CMD: the default command the container executes when it starts.

CMD ["executable","param1","param2"]

EXPOSE: the port the container listens on at runtime.

EXPOSE <port>

WORKDIR: the working directory for subsequent RUN, CMD, and ENTRYPOINT instructions.

WORKDIR /path/to/workdir/in/container

VOLUME: marks a directory in the container as a mount point that can be accessed from the host.

VOLUME ["/data"]
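
Putting these instructions together, a small Dockerfile might look like the sketch below. The image name, file names, and port are assumptions made up for illustration, not taken from an existing project.

# Example Dockerfile (illustrative): a small Python web application image

# Base image
FROM python:3.6

# Working directory for the following instructions
WORKDIR /app

# Copy the dependency list from the host and install dependencies at build time
ADD requirements.txt /app/
RUN pip install -r requirements.txt

# Copy the application code
ADD app.py /app/

# Port the application listens on
EXPOSE 8000

# Default command when the container starts
CMD ["python", "app.py"]

With this file in the current directory, the image is built and run like this:

# Build an image named my-python-app from the Dockerfile in the current directory
$ docker build -t my-python-app .

# Create a container from the image and map its port to the host
$ docker run -d -p 8000:8000 my-python-app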

Hello World

Let's run a hello-world example here. Open a command line terminal:

$ docker run hello-world
Unable to find image 'hello-world:latest' locally
latest: Pulling from library/hello-world
ca4f61b1923c: Pull complete
Digest: sha256:ca0eeb6fb05351dfc8759c20733c91def84cb8007aa89a5bf606bc8b315b9fc7
Status: Downloaded newer image for hello-world:latest

Hello from Docker!
This message shows that your installation appears to be working correctly.
...

Here the command line terminal outputs Hello from Docker!, which completes our first Docker example. We used docker run directly to create a container and execute the application; if the hello-world image is not available locally, the image is pulled from the registry first. The image files officially provided by Docker are all under the library group, which is the default group. We could also run docker image pull hello-world first to download the image before running the container. The same result can be achieved by writing a Dockerfile ourselves; the Dockerfile of the hello-world image can be viewed here: https://github.com/docker-lib.

Getting started with TensorFlow

About TensorFlow

TensorFlow is an open source software library for numerical computation using data flow graphs. Nodes in the graph represent mathematical operations, while the edges represent the multidimensional arrays (tensors) passed between them. With this flexible architecture, you can deploy computation to one or more CPUs or GPUs in a desktop machine, server, or mobile device through a single API. TensorFlow was originally developed by researchers and engineers on the Google Brain team (part of Google's Machine Intelligence research organization) for machine learning and deep neural network research, but the system is general enough to be applied in many other fields as well.

Learning materials

  • Tensorflow official website
  • TensorFlow github
  • Tensorflow Chinese community
  • Tensorflow official document Chinese – geek Academy

What is a data flow graph?

[Figure: a TensorFlow data flow graph]

A data flow graph describes mathematical computation as a directed graph of nodes and edges. A node usually represents a mathematical operation, but it can also represent the point where data is fed in, the point where results are pushed out, or the endpoint for reading or writing a persistent variable. An edge represents the input/output relationship between nodes. These data "lines" carry dynamically sized multidimensional arrays, i.e. tensors. The image of tensors flowing through the graph is where the name TensorFlow comes from. Once all the tensors at a node's inputs are ready, the node is assigned to a computing device and executed asynchronously, in parallel with other ready nodes.

All computation in TensorFlow is converted into nodes on a computation graph. Each node in the graph can have any number of inputs and any number of outputs, and each node describes one operation (op); a node can be thought of as an instance of an operation. The computation graph describes how data flows through the computation and is also responsible for maintaining and updating state, and users can branch or loop over parts of the graph. Graphs can be built from Python, C++, Go, Java, and other languages, but TensorFlow runs all operations outside of Python via the computation graph, for example on the CPU or GPU through C++ or CUDA. Python is really only an interface; the core computation runs underneath on the CPU or GPU through C++ or CUDA.

A TensorFlow graph describes a computation, but to actually perform the computation the graph must be launched in a session. The session places the graph's ops onto devices such as CPUs or GPUs and provides the methods to execute them; these methods return the resulting tensors. In Python the returned tensor is a numpy ndarray object; in C and C++ it is a tensorflow::Tensor instance.

From the description above we can pick out several important TensorFlow concepts: tensor, computation graph, node, and session. As mentioned before, the whole computation is like data (tensors) flowing along the edges of the graph through the nodes, with the computation driven by a session. In short, to complete the whole process we need to define the data, define the nodes and the computation graph, and start a session to run the computation. Most of the work we do in practice is defining exactly these things, as the small sketch below illustrates.
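
A minimal sketch of these concepts using the TensorFlow 1.x API assumed throughout this article (the constants and names are arbitrary):

# Python
import tensorflow as tf

# Define the data and the nodes of the computation graph
a = tf.constant(2.0, name='a')   # a constant tensor
b = tf.constant(3.0, name='b')   # another constant tensor
c = tf.add(a, b, name='c')       # an op node computing c = a + b

# Launch the graph in a session and run the computation
with tf.Session() as sess:
    print(sess.run(c))           # prints 5.0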

Using Docker to install TensorFlow

We have explained the basic concept of docker. Before reading this step, make sure you understand the basic process of creating a container. This section focuses on starting a docker container that contains the tensorflow binary image.

To start a docker container that contains a tensorflow binary image, enter a command in the following format:

$ docker run -it -p hostPort:containerPort TensorFlowImage

Where:

  • -p hostPort:containerPort is optional. If you plan to run TensorFlow programs from the shell, omit this option. If you plan to run TensorFlow programs from a Jupyter notebook, set both hostPort and containerPort to 8888. If you also want to run TensorBoard inside the container, add a second -p flag and set hostPort and containerPort to 6006 (a combined example follows below).
  • TensorFlowImage is required. It identifies the Docker image. You must specify one of the following values:

    • gcr.io/tensorflow/tensorflow: the TensorFlow binary image.
    • gcr.io/tensorflow/tensorflow:latest-devel: the TensorFlow binary image plus source code.

gcr.io is the Google Container Registry. Some TensorFlow images are also available on Docker Hub. Docker will download the TensorFlow binary image the first time you start it.
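
If you want both Jupyter Notebook and TensorBoard reachable from the host, you can map both ports in one command, combining the flags described above (a sketch; starting TensorBoard inside the container is still up to you):

$ docker run -it -p 8888:8888 -p 6006:6006 gcr.io/tensorflow/tensorflow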

For example, the following command starts a tensorflow CPU binary image in a docker container through which you can run the tensorflow program in the shell:

$ docker run -it gcr.io/tensorflow/tensorflow bash

Then call Python from shell, as shown below:

$ python

Enter the following lines of short program code in the python interactive shell:

# Python
import tensorflow as tf
hello = tf.constant('Hello, TensorFlow!')
sess = tf.Session()
print(sess.run(hello))

If the system outputs the following, you can start writing TensorFlow programs:

Hello, TensorFlow!

The actual result is shown below:

[Screenshot: running the TensorFlow example in the container shell]

The following command also starts a TensorFlow CPU binary image in a Docker container, but in this container you can run TensorFlow programs in Jupyter Notebook:

$ docker run -it -p 8888:8888 gcr.io/tensorflow/tensorflow

Here I ran the code above successfully in Jupyter Notebook through the browser.

The actual result is shown below:

[Screenshot: running the example in Jupyter Notebook]
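
If you want the notebooks you create to survive the container, you can additionally mount a host directory into the container with -v. This assumes the image serves notebooks from /notebooks, which was the convention of these TensorFlow images at the time; check the image's documentation if your version differs.

# Map the Jupyter port and keep notebooks on the host (paths are illustrative)
$ docker run -it -p 8888:8888 -v $PWD/notebooks:/notebooks gcr.io/tensorflow/tensorflow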

Reference resources

  • Getting started with docker
  • Docker Practice Series
  • Installing tensorflow on MacOS

Writing articles is not easy. The code itself may take only a few minutes, but a decent article takes days of thinking and then a few more days of typing it up. It is tiring but rewarding. If this article helped you, please buy me a cup of coffee!
