Linux Operations Notes: Deploying a Deep Learning Model with 2 K8S Commands - 职坐标

小标, 2019-03-15

Summary: This article shows how to deploy a deep learning model with two K8S (Kubernetes) commands, walking through each step in detail, as part of a series on Linux operations.

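Building deep learning models with Keras has become very popular, and Kubernetes, which automates deployment tasks without manual intervention, is an excellent way to deploy them. In this article we start from a deep learning model built with Keras, serve it as a REST API with Flask, and see how two Kubernetes commands are enough to deploy it. I will do the whole deployment on Google Cloud.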

The article has four parts:

Create the environment on Google Cloud;

Serve a deep learning model as an API using Keras, Flask, and Docker;

Deploy that model with Kubernetes;

Conclusion.

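Creating the Environment on Google Cloud

First, I use a small virtual machine on Google Compute Engine to build, serve, and containerize (Dockerize) the deep learning model. I tried to install the latest Docker CE (Community Edition) on my Windows 10 laptop, but failed.

So I decided to work on Google Cloud instead, which was more efficient than figuring out how to install Docker.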

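To start a Google Cloud VM, open the navigation menu on the left side of the screen, select “Compute Engine”, and then choose “Create Instance”. In my console, one instance was already running.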

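Next, choose the size of the compute resources. The default (read: cheapest) settings would work fine, but since we need this VM for about an hour at most, I chose 4 vCPUs and 15 GB of memory.

Then choose the operating system and disk space. Select “Boot Disk” to edit the defaults, pick CentOS 7 as the operating system, and increase the disk size from 10 GB to 100 GB (using CentOS is not required). I recommend a disk larger than 10 GB because each Docker container we create is roughly 1 GB.

Before creating the VM, set the firewall rules to Allow HTTP traffic and Allow HTTPS traffic. Later I will adjust the firewall settings further so that we can test our API on the VM before deploying the model to Kubernetes. Checking these boxes is therefore not sufficient; there is more work to be done.

Now click “Create”, and the Google Cloud environment is ready!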

Building a Deep Learning Model with Keras

SSH into the VM and start building the model. The easiest way is to click the SSH icon next to the VM; a terminal opens in your browser.

Remove Existing Docker Versions

sudo yum remove docker docker-client docker-client-latest docker-common docker-latest docker-latest-logrotate docker-logrotate docker-selinux docker-engine-selinux docker-engine

Note that these commands will differ if you chose an operating system other than CentOS 7.

Install the Latest Docker

sudo yum install -y yum-utils device-mapper-persistent-data lvm2
sudo yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
sudo yum install docker-ce

Start Docker and Run the Test Script

sudo systemctl start docker
sudo docker run hello-world

If the output looks like the following, setup is complete.

Hello from Docker!
This message shows that your installation appears to be working correctly.

To generate this message, Docker took the following steps:
 1. The Docker client contacted the Docker daemon.
 2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
    (amd64)
 3. The Docker daemon created a new container from that image which runs the
    executable that produces the output you are currently reading.
 4. The Docker daemon streamed that output to the Docker client, which sent it
    to your terminal.

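Creating the Deep Learning Model

We will copy a script written by Adrian Rosebrock. I had to make two edits to Adrian's script to get it running. If you don't care about the Docker and TensorFlow details, skip the next two paragraphs.

Docker issue: when run locally, the application is served on localhost (127.0.0 …) by default. That breaks inside a Docker container, because a server bound only to the container's loopback address cannot be reached from outside the container. The fix: when calling app.run(), specify the host as 0.0.0.0, as in app.run(host='0.0.0.0'). The application is then reachable on both the local and the external IP.

TensorFlow issue: when I ran Adrian's original script, I could not call the model successfully. The following change to the code, which uses the TensorFlow 1.x API tf.get_default_graph(), resolves it.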

global graph
graph = tf.get_default_graph()
...
with graph.as_default():
    preds = model.predict(image)

Create a new directory named “keras-app” and change into it.

mkdir keras-app
cd keras-app

Create a file named “app.py”. Use whichever text editor you like; I prefer Vim. To create and open app.py, type:

vim app.py

Inside the file, press the “i” key to enter insert mode, then paste this code:

# USAGE
# Start the server:
#     python app.py
# Submit a request via cURL:
#     curl -X POST -F image=@dog.jpg 'http://localhost:5000/predict'

# import the necessary packages
from keras.applications import ResNet50
from keras.preprocessing.image import img_to_array
from keras.applications import imagenet_utils
from PIL import Image
import numpy as np
import flask
import io
import tensorflow as tf

# initialize our Flask application and the Keras model
app = flask.Flask(__name__)
model = None

def load_model():
    # load the pre-trained Keras model (here we are using a model
    # pre-trained on ImageNet and provided by Keras, but you can
    # substitute in your own networks just as easily)
    global model
    model = ResNet50(weights="imagenet")
    global graph
    graph = tf.get_default_graph()

def prepare_image(image, target):
    # if the image mode is not RGB, convert it
    if image.mode != "RGB":
        image = image.convert("RGB")

    # resize the input image and preprocess it
    image = image.resize(target)
    image = img_to_array(image)
    image = np.expand_dims(image, axis=0)
    image = imagenet_utils.preprocess_input(image)

    # return the processed image
    return image

@app.route("/predict", methods=["POST"])
def predict():
    # initialize the data dictionary that will be returned from the
    # view
    data = {"success": False}

    # ensure an image was properly uploaded to our endpoint
    if flask.request.method == "POST":
        if flask.request.files.get("image"):
            # read the image in PIL format
            image = flask.request.files["image"].read()
            image = Image.open(io.BytesIO(image))

            # preprocess the image and prepare it for classification
            image = prepare_image(image, target=(224, 224))

            # classify the input image and then initialize the list
            # of predictions to return to the client
            with graph.as_default():
                preds = model.predict(image)
                results = imagenet_utils.decode_predictions(preds)
                data["predictions"] = []

                # loop over the results and add them to the list of
                # returned predictions
                for (imagenetID, label, prob) in results[0]:
                    r = {"label": label, "probability": float(prob)}
                    data["predictions"].append(r)

                # indicate that the request was a success
                data["success"] = True

    # return the data dictionary as a JSON response
    return flask.jsonify(data)

# if this is the main thread of execution first load the model and
# then start the server
if __name__ == "__main__":
    print(("* Loading Keras model and Flask starting server..."
        "please wait until server has fully started"))
    load_model()
    app.run(host='0.0.0.0')

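After pasting the code, press “Esc” to leave insert mode, then type :x to save and close the file.

Creating the requirements.txt File

We will run the code above in a Docker container, so first create a requirements.txt file. It lists the packages the code needs to run, such as Keras and Flask. That way, wherever the Docker container ends up, the required dependencies get installed.

Create and open a file named “requirements.txt” with Vim by typing vim requirements.txt. Copy the following into it, then save and close.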

keras
tensorflow
flask
gevent
pillow
requests
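
As a quick sanity check before building the image, the short Python sketch below reports which of these packages cannot be imported in the current environment. It is an illustration rather than part of the original tutorial; the list of import names is an assumption derived from the package list above (note that pillow is imported as PIL), and missing_modules is a hypothetical helper name.

```python
import importlib.util

# Import names assumed for the packages in requirements.txt.
# The "pillow" distribution is imported under the name "PIL".
REQUIRED_MODULES = ["keras", "tensorflow", "flask", "gevent", "PIL", "requests"]


def missing_modules(modules):
    """Return the module names that cannot be found on the import path."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    print("missing:", missing or "none")
```

An empty result only means the modules can be located; it does not check versions.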

Creating the Dockerfile

Next create the Dockerfile, the file Docker reads to build and run the project:

FROM python:3.6
WORKDIR /app
COPY requirements.txt /app
RUN pip install -r ./requirements.txt
COPY app.py /app
CMD ["python", "app.py"]

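This tells Docker to download a Python 3 base image. It then has Docker use pip, the Python package manager, to install the packages listed in requirements.txt, and finally tells Docker to run the script with python app.py.

Building the Docker Image

To build and test the application, first build the Docker image by running: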

sudo docker build -t keras-app:latest .

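This tells Docker to build an image from the code in the current working directory, keras-app. The command takes a minute or two to complete; in the background, Docker downloads the Python 3.6 image and installs the packages listed in requirements.txt.

Running the Docker Container

Run the Docker container to test the application: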

sudo docker run -d -p 5000:5000 keras-app

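A quick note on 5000:5000: the first number publishes port 5000 on the host, and Docker forwards traffic arriving there to port 5000 inside the container, where the application is listening. To check the status of the running container, run sudo docker ps -a; you should see something like this: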

[gustafcavanaugh@instance-3 ~]$ sudo docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
d82f65802166 keras-app "python app.py" About an hour ago Up About an hour 0.0.0.0:5000->5000/tcp nervous_northcutt

Testing the Model

As a test, send the model a photo of a dog and have it return the dog's breed.

From the terminal, run:

curl -X POST -F image=@dog.jpg 'http://localhost:5000/predict'

Make sure “dog.jpg” is in your current directory (or provide the appropriate path to the file).

The result looks like this:

{"predictions":[{"label":"beagle","probability":0.987775444984436},
{"label":"pot","probability":0.0020967808086425066},
{"label":"Cardigan","probability":0.001351703773252666},
{"label":"Walker_hound","probability":0.0012711131712421775},
{"label":"Brittany_spaniel","probability":0.0010085132671520114}],
"success":true}

Our model correctly classifies the dog as a beagle. You have now served a trained deep learning model with Keras and Flask and packaged it with Docker.
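
The same request can also be sent from Python. The sketch below is not part of the original tutorial and uses only the standard library: build_multipart, top_prediction, and predict are hypothetical helper names, while the form field name image and the /predict path are taken from app.py above.

```python
import json
import urllib.request
import uuid


def build_multipart(field, filename, payload, content_type="image/jpeg"):
    """Encode one file as a multipart/form-data body.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def top_prediction(response_text):
    """Return (label, probability) of the most probable prediction, or None."""
    data = json.loads(response_text)
    if not data.get("success") or not data.get("predictions"):
        return None
    best = max(data["predictions"], key=lambda p: p["probability"])
    return best["label"], best["probability"]


def predict(url, image_path):
    """POST the image file to the prediction endpoint and return the top result."""
    with open(image_path, "rb") as fh:
        body, header = build_multipart("image", "dog.jpg", fh.read())
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": header}, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return top_prediction(response.read().decode("utf-8"))
```

With the container running, a call such as predict("http://localhost:5000/predict", "dog.jpg") should return a (label, probability) pair corresponding to the first entry of the JSON shown above.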

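Deploying the Model with Kubernetes

Creating a Docker Hub Account

Upload the image you built to Docker Hub. (If you don't have a Docker Hub account, create one now.) We will not copy the container into the Kubernetes cluster ourselves; instead, we tell Kubernetes to pull it from a centrally hosted registry, namely Docker Hub.

Once you have a Docker Hub account, log in from the command line with sudo docker login. You will need to supply your username and password, just as when logging in to the website.

If you see a message like this:

Login Succeeded

then you have logged in successfully.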

Tagging the Image

Tag the image with a name before uploading it. Run sudo docker images and find the image ID of keras-app.

The output looks like this:

REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
keras-app           latest              ddb507b8a017        About an hour ago   1.61GB

Now tag keras-app. Be sure to follow my format, replacing the image ID and Docker Hub ID with your own values.

#Format
sudo docker tag <your image id> <your docker hub id>/<app name>
#My Exact Command - Make Sure To Use Your Inputs
sudo docker tag ddb507b8a017 gcav66/keras-app

Pushing the Image to Docker Hub

Push the image from the shell:

#Format
sudo docker push <your docker hub name>/<app-name>
#My exact command
sudo docker push gcav66/keras-app

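When you go back to the Docker Hub website, you will see the keras-app repository.

Creating a Kubernetes Cluster

On the Google Cloud home screen, select “Kubernetes Engine”.

Then create a new Kubernetes cluster.

Next, customize the size of the nodes in the cluster. I chose 4 vCPUs and 15 GB of RAM; you can try this with a smaller cluster. Remember that the default settings start 3 nodes, so the cluster has 3 times the resources you select per node: in my case, 45 GB of RAM.

Click Create; the cluster takes a minute or two to start. Then click “Run in Cloud Shell” to bring up a console for the Kubernetes cluster. Note that this shell environment is separate from the VM on which you built and tested the Docker container. We could install Kubernetes on the VM, but Google's Kubernetes service automates the installation for us.

Now run the Docker container on Kubernetes. Note that the image argument simply points to the image hosted on Docker Hub. We also specify with --port that the application runs on port 5000.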

kubectl run keras-app --image=gcav66/keras-app --port 5000

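In Kubernetes, containers run inside Pods. (With kubectl 1.18 and later, kubectl run creates only a bare Pod and no Deployment; on those versions, run kubectl create deployment keras-app --image=gcav66/keras-app instead, so that the expose deployment command later in this section has a Deployment to target.) Verify that the Pod is running by typing kubectl get pods. Output like the following means it is: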

gustafcavanaugh@cloudshell:~ (basic-web-app-test)$ kubectl get pods
NAME READY STATUS RESTARTS AGE
keras-app-79568b5f57-5qxqk 1/1 Running 0 1m

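Now that the Pod is running, we need to expose it to the outside world on port 80. This means that anyone who visits the IP address of our deployment can reach our API, without having to append a port number to the URL (goodbye, :5000).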

kubectl expose deployment keras-app --type=LoadBalancer --port 80 --target-port 5000
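
For readers who prefer declarative configuration, the expose command above corresponds roughly to a Service object like the one this sketch builds and prints as JSON. It is only an approximation under stated assumptions: the selector label run=keras-app is what older kubectl run versions typically attached to the Deployment, so check your Pod labels (kubectl get pods --show-labels) before relying on it, and loadbalancer_service is a hypothetical helper name.

```python
import json


def loadbalancer_service(name, port, target_port, selector):
    """Build a Kubernetes Service manifest (as a dict) of type LoadBalancer."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            "type": "LoadBalancer",
            # Assumption: the Pods carry this label; adjust to match your cluster.
            "selector": selector,
            # Traffic arriving on `port` is forwarded to `targetPort` in the Pod.
            "ports": [{"protocol": "TCP", "port": port, "targetPort": target_port}],
        },
    }


if __name__ == "__main__":
    manifest = loadbalancer_service("keras-app", 80, 5000, {"run": "keras-app"})
    print(json.dumps(manifest, indent=2))
```

The port/targetPort pair is what lets callers drop the :5000 suffix: the load balancer listens on 80 and forwards to the Flask port inside the Pod.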

Run kubectl get service to check the status of the deployment (and to find the URL for calling the API). Output like the following means it worked:

gustafcavanaugh@cloudshell:~ (basic-web-app-test)$ kubectl get service
NAME         TYPE           CLUSTER-IP     EXTERNAL-IP     PORT(S)        AGE
keras-app    LoadBalancer   10.11.250.71   35.225.226.94   80:30271/TCP   4m
kubernetes   ClusterIP      10.11.240.1    <none>          443/TCP        18m
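
When scripting the deployment, the external IP can be extracted from that table programmatically. The following is a minimal sketch, assuming the whitespace-separated column layout shown above; external_ip is a hypothetical helper, and the sample text is copied from the output above. While the load balancer is still being provisioned, kubectl reports <pending> in this column, which the helper treats as not yet available.

```python
SAMPLE_OUTPUT = """\
NAME         TYPE           CLUSTER-IP     EXTERNAL-IP     PORT(S)        AGE
keras-app    LoadBalancer   10.11.250.71   35.225.226.94   80:30271/TCP   4m
kubernetes   ClusterIP      10.11.240.1    <none>          443/TCP        18m
"""


def external_ip(kubectl_output, service):
    """Return the EXTERNAL-IP of `service`, or None if it is absent or not assigned."""
    lines = [ln for ln in kubectl_output.strip().splitlines() if ln.strip()]
    if not lines:
        return None
    # Locate the column by header name rather than by a fixed position.
    column = lines[0].split().index("EXTERNAL-IP")
    for line in lines[1:]:
        fields = line.split()
        if fields and fields[0] == service:
            value = fields[column]
            return None if value in ("<none>", "<pending>") else value
    return None


if __name__ == "__main__":
    print(external_ip(SAMPLE_OUTPUT, "keras-app"))
```

In a real script, the text would come from running kubectl get service and capturing its standard output.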

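Take the EXTERNAL-IP of the keras-app service, open a local terminal (or wherever you have a dog photo handy), and call the API with curl -X POST -F image=@dog.jpg 'http://<your service IP>/predict'.

The result is shown below; the API correctly returns the beagle label for the picture: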

$ curl -X POST -F image=@dog.jpg 'http://35.225.226.94/predict'
{"predictions":
[{"label":"beagle","probability":0.987775444984436},
{"label":"pot","probability":0.0020967808086425066},
{"label":"Cardigan","probability":0.001351703773252666},
{"label":"Walker_hound","probability":0.0012711131712421775},
{"label":"Brittany_spaniel","probability":0.0010085132671520114}],
"success":true}

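Conclusion

In this tutorial, we served a deep learning model as a REST API with Keras and Flask, put the application in a Docker container, uploaded the container to Docker Hub, and deployed it with Kubernetes.

With just two commands, Kubernetes deployed our application and exposed it to the outside world. Feels good, doesn't it? There is still plenty of room for improvement, of course. For example, we could replace Flask's built-in development server with a production-grade server such as gunicorn, and we could explore the scaling and management features of Kubernetes.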

Published on 职坐标 by @小标. Reproduction without permission is prohibited.