
Wednesday, February 23, 2022

Installing Kubeflow in GCP

 Like AWS, Google Cloud Platform (GCP) provides a managed Kubernetes control plane, GKE. We can install Kubeflow in GCP using the following steps:

1. Register for a GCP account and create a project on the console

This project will be where the various resources associated with Kubeflow will reside.

2. Enable required services

The services required to run Kubeflow on GCP are listed below; a command-line sketch for enabling them follows the list:

* Compute Engine API

* Kubernetes Engine API

* Identity and Access Management (IAM) API

* Deployment Manager API

* Cloud Resource Manager API

* Cloud Filestore API

* AI Platform Training & Prediction API
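These can be enabled from the GCP console or, once the gcloud CLI from step 4 is available, from the command line. The following is a minimal sketch; the project ID is a placeholder, and the service identifiers are the standard API names for the services listed above:

gcloud services enable compute.googleapis.com container.googleapis.com \
    iam.googleapis.com deploymentmanager.googleapis.com \
    cloudresourcemanager.googleapis.com file.googleapis.com \
    ml.googleapis.com --project=<your-project-id>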

3. Set up OAuth (optional)

If you wish to make a secure deployment then, as with AWS, you must follow the instructions to add authentication to your installation, located at https://www.kubeflow.org/docs/gke/deploy/oauth-setup/. Alternatively, you can just use the username and password for your GCP account.

4. Set up the GCloud CLI

This is parallel to the AWS CLI covered in the previous section. Installation instructions are available at https://cloud.google.com/sdk/. You can verify your installation by running:

gcloud --help

5. Download the Kubeflow command-line tool

Links are located on the Kubeflow releases page (https://github.com/kubeflow/kubeflow/releases/tag/v0.7.1). Download the appropriate tarball for your platform and unpack it using:

tar -xvf kfctl_v0.7.1_<platform>.tar.gz

6. Log in to GCloud and create user credentials

We next need to log in to our account and create the credential tokens we will use to interact with resources in it:

gcloud auth login

gcloud auth application-default login

7. Set up environment variables and deploy Kubeflow

As with AWS, we need to enter values for a few key environment variables: the directory containing the Kubeflow configuration files (${KF_DIR}), the name of the Kubeflow deployment (${KF_NAME}), the path to the base configuration URI (${CONFIG_URI}; for GCP this is https://raw.githubusercontent.com/kubeflow/manifests/v0.7-branch/kfdef/kfctl_gcp_iap.0.7.1.yaml), the name of the Google project (${PROJECT}), and the zone it runs in (${ZONE}).
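A minimal sketch of setting these variables is shown below; the deployment name, directory, project ID, and zone are placeholder values of my own rather than values prescribed by Kubeflow:

export KF_NAME=my-kubeflow

export KF_DIR=~/kf-deployments/${KF_NAME}

export CONFIG_URI="https://raw.githubusercontent.com/kubeflow/manifests/v0.7-branch/kfdef/kfctl_gcp_iap.0.7.1.yaml"

export PROJECT=<your-gcp-project-id>

export ZONE=us-central1-a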

8. Launch Kubeflow

As with AWS, we use Kustomize (via kfctl) to build the template file and launch Kubeflow:

mkdir -p ${KF_DIR}

cd ${KF_DIR}

kfctl apply -V -f ${CONFIG_URI}

Once Kubeflow is launched, you can get the URL to the dashboard using:

kubectl -n istio-system get ingress
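If the ingress address is not yet available, or you just want to reach the dashboard from your own machine, a common alternative (an assumption on my part, not a step from this guide) is to port-forward the Istio ingress gateway, which is named istio-ingressgateway by default in this release:

kubectl port-forward -n istio-system svc/istio-ingressgateway 8080:80

You can then browse to http://localhost:8080 to reach the dashboard.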


Installing Kubeflow in AWS

 In order to run Kubeflow in AWS, we need a Kubernetes control plane available in the cloud. Fortunately, Amazon provides a managed service called EKS, which provides an easy way to provision a control plane on which to deploy Kubeflow. Follow these steps to deploy Kubeflow on AWS:


    1. Register for an AWS account and install the AWS Command Line Interface

The AWS CLI is needed to interact with the various AWS services; follow the installation instructions for your platform located at https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-install.html. Once it is installed, enter:

aws configure

to set up your account and key information for provisioning resources.

    2. Install eksctl

This command-line utility allows us to provision a Kubernetes control plane in Amazon from the command line. Follow the instructions at https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-install.html to install it.
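As an illustration of what provisioning with eksctl looks like (the cluster name, region, and node settings below are placeholders of my own, not values from this guide), a basic cluster for Kubeflow could be created with:

eksctl create cluster --name kubeflow-demo --region us-west-2 --nodes 2 --node-type m5.large

This provisions both the EKS control plane and a small group of worker nodes in one step.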

    3. Install aws-iam-authenticator

To allow kubectl to interact with EKS, we need to provide the correct permissions using the IAM authenticator to modify our kubeconfig. Please see the installation instructions at https://docs.aws.amazon.com/eks/latest/userguide/install-aws-iam-authenticator.html.
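Once it is installed, you can confirm the binary is available and, assuming you have already created an EKS cluster (the name and region below are placeholders), have the AWS CLI write the matching kubeconfig entry for you:

aws-iam-authenticator version

aws eks update-kubeconfig --name kubeflow-demo --region us-west-2

kubectl get nodes

If kubectl get nodes returns your worker nodes, the authenticator and kubeconfig are set up correctly.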

    4. Download the Kubeflow command-line tool

Links are located on the Kubeflow releases page (https://github.com/kubeflow/kubeflow/releases/tag/v0.7.1). Download the appropriate tarball for your platform and unpack it using:

tar -xvf kfctl_v0.7.1_<platform>.tar.gz

    5. Build the configuration file

We first need to enter environment variables for the Kubeflow application directory (${KF_DIR}), the name of the deployment (${KF_NAME}), and the path to the base configuration file for the deployment (${CONFIG_URI}), which for AWS deployments is located at https://raw.githubusercontent.com/kubeflow/manifests/v0.7-branch/kfdef/kfctl_aws.0.7.1.yaml.
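A minimal sketch of setting these variables is shown below; the deployment name and directory are placeholders of my own choosing rather than required values:

export KF_NAME=kubeflow-aws

export KF_DIR=~/kf-deployments/${KF_NAME}

export CONFIG_URI="https://raw.githubusercontent.com/kubeflow/manifests/v0.7-branch/kfdef/kfctl_aws.0.7.1.yaml"

With these set, run the following to generate the configuration file: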

mkdir -p ${KF_DIR}

cd ${KF_DIR}

kfctl build -V -f ${CONFIG_URI}

This will generate a local configuration file named kfctl_aws.0.7.1.yaml. If this looks like Kustomize, that's because kfctl is using Kustomize under the hood to build the configuration. We also need to add an environment variable for the location of the local config file, ${CONFIG_FILE}, which in this case is:

export CONFIG_FILE=${KF_DIR}/kfctl_aws.0.7.1.yaml

    6. Launch Kubeflow on EKS

Use the following commands to launch Kubeflow:

cd ${KF_DIR}

rm -rf kustomize/

kfctl apply -V -f ${CONFIG_FILE}

It will take a while for all the Kubeflow components to become available; you can check the progress by using the following command:

kubectl -n kubeflow get all

Once they are all available, we can get the URL for the Kubeflow dashboard using:

kubectl get ingress -n istio-system

This will take us to the dashboard view shown in the MiniKF example.

Note that in the default configuration, this address is open to the public; for secure applications, we need to add authentication using the instructions at https://www.kubeflow.org/docs/aws/authentication/.



Sunday, February 20, 2022

Running Kubeflow locally with MiniKF

 If we want to get started quickly or prototype our application locally, we can avoid setting up a cloud account and instead use virtual machines to simulate the kind of resources we would provision in the cloud. To set up Kubeflow locally, we first need to install VirtualBox (https://www.virtualbox.org/wiki/Downloads) to run virtual machines, and Vagrant to run the configuration for setting up a Kubernetes control plane and Kubeflow on VirtualBox VMs (https://www.vagrantup.com/downloads.html).

Once you have these dependencies installed, create a new directory, change into it, and run:

vagrant init arrikto/minikf

vagrant up

This initializes the VirtualBox configuration and brings up the application. You can now navigate to http://10.10.10.10/ and follow the instructions to launch Kubeflow and Rok (a storage volume for data used in experiments on Kubeflow, created by Arrikto). Once these have been provisioned, you should see a screen like this (Figure 2.5):

Log in to Kubeflow to see the dashboard with the various components

We will return to these components later and go through the various functionalities available on Kubeflow, but first, let's walk through how to install Kubeflow in the cloud.

Kubeflow: an end-to-end machine learning lab

 As was described at the beginning of this chapter, there are many components of an end-to-end lab for machine learning research and development (Table 2.1), such as:

- A way to manage and version library dependencies, such as TensorFlow, and package them for a reproducible computing environment

- Interactive research environments where we can visualize data and experiment with different settings

- Provisioning of resources to run the modeling process in a distributed manner

- Robust mechanisms for snapshotting historical versions of the research process


As we described earlier in this chapter, TensorFlow was designed to utilize distributed resources for training. To leverage this capability, we will use the Kubeflow project. Built on top of Kubernetes, Kubeflow has several components that are useful in the end-to-end process of managing machine learning applications. To install Kubeflow, we need to have an existing Kubernetes control plane instance and use kubectl to launch Kubeflow's various components. The steps for setup differ slightly depending upon whether we are using a local instance or one of the major cloud providers.
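Before launching any Kubeflow components, it is worth confirming that kubectl can actually reach your control plane; a quick, generic sanity check (not specific to Kubeflow) is:

kubectl cluster-info

If this prints the address of the Kubernetes control plane, kubectl is correctly configured for your cluster.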


Kustomize for configuration management

 Like most code, we most likely want to ultimately store the YAML files we use to issue commands to Kubernetes in a version control system such as Git. This leads to some cases where this format might not be ideal: for example, in a machine learning pipeline, we might perform hyperparameter searches where the same application is being run with slightly different parameters, leading to a glut of duplicate command files.

Or, we might have arguments, such as AWS account keys, that for security reasons we do not want to store in a text file. We might also want to increase reuse by splitting our command into a base and additions; for example, in the YAML file shown in Code 2.1, if we wanted to run nginx alongside different databases, or specify file storage in the different cloud object stores provided by Amazon, Google, and Microsoft Azure.


For these use cases, we will make use of the Kustomize tool (https://kustomize.io), which is also available through kubectl as:

kubectl apply -k <kustomization.yaml>

Alternatively, we could use the Kustomize command-line tool. A kustomization.yaml file is a template for a Kubernetes application; for example, consider the following template for the training job in the Kubeflow example repository (http://github.com/kubeflow/pipelines/blob/master/manifests/kustomize/sample/kustomization.yaml):

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

bases:
  # Or
  # github.com/kubeflow/pipelines/manifests/kustomize/env/gcp?ref=1.0.0
  - ../env/gcp
  # Kubeflow Pipelines servers are capable of
  # collecting Prometheus metrics.
  # If you want to monitor your Kubeflow Pipelines servers
  # with those metrics, you'll need a Prometheus server
  # in your Kubeflow Pipelines cluster.
  # If you don't already have a Prometheus server up, you
  # can uncomment the following configuration files for Prometheus.
  # If you have your own Prometheus server up already
  # or you don't want a Prometheus server for monitoring,
  # you can comment the following line out.
  # - ../third_party/prometheus
  # - ../third_party/grafana

# Identifier for application manager to apply ownerReference.
# The ownerReference ensures the resources get garbage collected
# when application is deleted.
commonLabels:
  application-crd-id: kubeflow-pipelines

# Used by Kustomize
configMapGenerator:
  - name: pipeline-install-config
    env: params.env
    behavior: merge

secretGenerator:
  - name: mysql-secret
    env: params-db-secret.env
    behavior: merge

# !!! If you want to customize the namespace,
# please also update
# sample/cluster-scoped-resources/kustomization.yaml's
# namespace field to the same value
namespace: kubeflow

### Customization ###
# 1. Change values in params.env file
# 2. Change values in params-db-secret.env
# file for CloudSQL username and password
# 3. kubectl apply -k ./
###
We can see that this file refers to a base set of configurations in a separate kustomization.yaml file located at the relative path ../env/gcp. To edit variables in this file, for instance, to change the namespace for the application, we would run:

kustomize edit set namespace mykube

We could also add configuration maps to pass to the training job, using a key-value format, for example:

kustomize edit add configmap configMapGenerator --from-literal=myval=myval

Finally, when we are ready to execute these commands on Kubernetes, we can build the necessary kubectl command dynamically and apply it, assuming kustomization.yaml is in the current directory:

kustomize build . | kubectl apply -f -
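To make the base-plus-additions idea from earlier concrete, here is a small hypothetical overlay of my own (the directory layout, namespace, and image tag are illustrative, not part of the Kubeflow sample): an overlay kustomization.yaml points at a base directory and overrides only the fields that differ between runs:

# overlays/dev/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

bases:
  - ../../base

namespace: dev

images:
  - name: nginx
    newTag: 1.7.9

Running kustomize build overlays/dev then produces the base manifests with the namespace and image tag swapped in, which avoids the glut of duplicate command files described at the start of this section.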

Hopefully, these examples demonstrate how Kustomize provides a flexible way to generate the YAML we need for kubectl using a template; we will make use of it often in the process of parameterizing our workflows later in this book.

Now that we have covered how Kubernetes manages Docker applications in the cloud, and how Kustomize can allow us to flexibly reuse kubectl YAML commands, let's look at how these components are tied together in Kubeflow to run the kinds of experiments we will be undertaking later to create generative AI models in TensorFlow.

Saturday, February 19, 2022

Important Kubernetes commands

 In order to interact with a Kubernetes cluster running in the cloud, we typically utilize the Kubernetes command-line tool (kubectl). Instructions for installing kubectl for your operating system can be found at https://kubernetes.io/docs/tasks/tools/install-kubectl/. To verify that you have successfully installed kubectl, you can again run the help command in the terminal:

kubectl --help

Like Docker, kubectl has many commands; the important one that we will use is the apply command, which, like docker-compose, takes a YAML file as input and communicates with the Kubernetes control plane to start, update, or stop pods:

kubectl apply -f <file.yaml>

As an example of how the apply command works, let us look at a YAML file for deploying a web server (nginx) application:

apiVersion: v1
kind: Service
metadata:
  name: my-nginx-svc
  labels:
    app: nginx
spec:
  type: LoadBalancer
  ports:
    - port: 80
  selector:
    app: nginx
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-nginx
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:1.7.9
          ports:
            - containerPort: 80

The resources specified in this file are created on the Kubernetes cluster nodes in the order in which they are listed in the file. First, we create the load balancer, which routes external traffic between copies of the nginx web server. The metadata is used to tag these applications for querying later using kubectl. Second, we create a set of 3 replicas of the nginx pod, using a consistent container image (nginx:1.7.9), which use port 80 on their respective containers.
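As a small illustration of querying by those labels (the label key and service name come from the YAML above; the commands themselves are standard kubectl usage rather than anything specific to this example):

kubectl get pods -l app=nginx

kubectl get service my-nginx-svc

The first command lists only the pods carrying the app: nginx label, and the second shows the external address assigned to the load balancer.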

The same set of physical resources of a Kubernetes cluster can be shared among several virtual clusters using namespaces; this allows us to segregate resources among multiple users or groups. This can allow, for example, each team to run their own set of applications and logically behave as if they are the only users. Later, in our discussion of Kubeflow, we will see how this feature can be used to logically partition projects on the same Kubeflow instance.
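For instance (the namespace name below is a hypothetical example of my own), a team could be given its own namespace and then address resources within it using the -n flag:

kubectl create namespace team-a

kubectl apply -f <file.yaml> -n team-a

kubectl get pods -n team-a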


Kubernetes: Robust management of multi-container applications

 The Kubernetes project, sometimes abbreviated as k8s, was born out of an internal container management project at Google known as Borg. Kubernetes comes from the Greek word for navigator, as denoted by the seven-spoke wheel of the project's logo. Kubernetes is written in the Go programming language and provides a robust framework to deploy and manage Docker container applications on the underlying resources managed by cloud providers (such as Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform (GCP)).

Kubernetes is fundamentally a tool to control applications composed of one or more Docker containers deployed in the cloud; this collection of containers is known as a pod. Each pod can have one or more copies (to allow redundancy), which is known as a replicaset. The two main components of a Kubernetes deployment are a control plane and nodes. The control plane hosts the centralized logic for deploying and managing pods, and consists of (Figure 2.4):

- Kube-api-server: This is the main application that listens to commands from the user to deploy or update a pod, or manages external access to pods via ingress.

- Kube-controller-manager: An application to manage functions such as controlling the number of replicas per pod.

- Cloud-controller-manager: Manages functions particular to a cloud provider.

- Etcd: A key-value store that maintains the environment and state variables of different pods.

- Kube-scheduler: An application that is responsible for finding workers to run a pod.

While we could set up our own control plane, in practice we will usually have this function managed by our cloud provider, such as Google Kubernetes Engine (GKE) or Amazon's Elastic Kubernetes Service (EKS). The Kubernetes nodes, the individual machines in the cluster, each run an application known as a kubelet, which monitors the pod(s) running on that node.

Now that we have a high-level view of the Kubernetes system, let's look at the important commands you will need to interact with a Kubernetes cluster, update its components, and start and stop applications.