
Saturday, February 26, 2022

Kubeflow pipelines

For notebook servers, we gave an example of a single-container (the notebook instance) application. Kubeflow also gives us the ability to run multi-container application workflows (such as input data, training, and deployment) using the pipelines functionality. Pipelines are Python functions that follow a domain-specific language (DSL) to specify components that will be compiled into containers.

If we click Pipelines in the UI, we are brought to a dashboard.

Selecting one of these pipelines, we can see a visual overview of the component containers.


After creating a new run, we can specify parameters for a particular instance of this pipeline.

Once the pipeline is created, we can use the user interface to visualize the results.

Under the hood, the Python code to generate this pipeline is compiled using the pipelines SDK. We could specify the components to come either from a container with Python code:


    @kfp.dsl.component
    def my_component(my_param):
        ...
        # the component's step runs inside the specified container image
        return kfp.dsl.ContainerOp(
            name='My component name',
            image='gcr.io/path/to/container/image'
        )

or a function written in Python itself:

    @kfp.dsl.python_component(
        name='My awesome component',
        description='Come and play',
    )
    def my_python_func(a: str, b: str) -> str:
        ...


For a pure Python function, we could turn this into an operation with the compiler:

    my_op = compiler.build_python_component(
        component_func=my_python_func,
        staging_gcs_path=OUTPUT_DIR,
        target_image=TARGET_IMAGE)


We then use the dsl.pipeline decorator to add this operation to a pipeline:

    @kfp.dsl.pipeline(
        name='My pipeline',
        description='My machine learning pipeline'
    )
    def my_pipeline(param_1: PipelineParam, param_2: PipelineParam):
        my_step = my_op(a='a', b='b')


We compile it using the following code:

    kfp.compiler.Compiler().compile(my_pipeline, 'my-pipeline.zip')

and run it with this code:

    client = kfp.Client()
    my_experiment = client.create_experiment(name='demo')
    my_run = client.run_pipeline(my_experiment.id, 'my-pipeline', 'my-pipeline.zip')

We can also upload this ZIP file to the pipelines UI, where Kubeflow can use the YAML generated during compilation to instantiate the job.
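
As a minimal sketch (assuming the kfp v1 SDK, plus the my_pipeline function and client from above), we could also compile the pipeline directly to YAML and upload the package programmatically rather than through the UI:

    # the compiler infers the output format (YAML, ZIP, tar.gz) from the extension
    kfp.compiler.Compiler().compile(my_pipeline, 'my-pipeline.yaml')

    # upload the compiled package to the Pipelines service
    client = kfp.Client()
    client.upload_pipeline(
        pipeline_package_path='my-pipeline.yaml',
        pipeline_name='My pipeline'
    )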

Now that you have seen the process for generating results for a single pipeline, our next problem is how to generate the optimal parameters for such a pipeline. As you will see in Chapter 3, Building Blocks of Deep Neural Networks, neural network models typically have a number of configuration choices for both the architecture (such as the number of layers, layer size, and connectivity) and the training paradigm (such as the learning rate and optimizer algorithm). Kubeflow has a built-in utility for optimizing models over such parameter grids, called Katib.
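
To make the idea of a parameter grid concrete, here is a purely illustrative Python sketch (not Katib's actual API; the parameter names and values are hypothetical) that enumerates the trial combinations such a search would have to cover:

    import itertools

    # hypothetical search space mixing architecture and training choices
    param_grid = {
        'num_layers': [2, 4, 8],
        'layer_size': [64, 128],
        'learning_rate': [1e-2, 1e-3],
        'optimizer': ['sgd', 'adam'],
    }

    # each combination corresponds to one trial that would be launched and scored
    for values in itertools.product(*param_grid.values()):
        trial_params = dict(zip(param_grid.keys(), values))
        print(trial_params)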
