IBM open-sources Kubeflow Pipelines on Tekton for portable machine learning models
IBM Corp. said today it’s hoping to provide a standardized solution for developers to create and deploy machine learning models in production and make them portable to any cloud platform.
To do so, it said it’s open-sourcing the Kubeflow machine learning platform on Tekton, a continuous integration/continuous delivery platform developed by Google LLC. Tekton is popular with developers who use Kubernetes to manage containerized applications, which can run unchanged across many computing environments.
IBM said it created Kubeflow Pipelines on Tekton in response to the need for a more reliable solution for deploying, monitoring and governing machine learning models in production on any cloud platform. That’s important, IBM says, because hybrid cloud models are rapidly becoming the norm for many enterprises that want to take advantage of the benefits of running their most critical business applications across distributed computing environments.
One of the main complaints of data science and application development and delivery teams is the difficulty of creating portable machine learning models that can run on any cloud, Animesh Singh, IBM’s chief architect of data and AI open source, and Trent Gray-Donald, a distinguished engineer in AI, said in a blog post. They cite problems such as manual handoffs, frantic monitoring and loose governance, which can cause big delays in the deployment of new models.
IBM’s response to this has been to double down on Machine Learning Operations, or MLOps, which refers to a practice for collaboration and communication between data scientists and operations professionals to help manage production ML lifecycles.
IBM has since adopted Kubeflow Pipelines as the basis of its MLOps. Kubeflow is a machine learning platform that’s focused on distributed training, hyperparameter optimization, production model serving and management. Singh and Gray-Donald said Kubeflow was chosen primarily for its ability to tap into the Kubernetes ecosystem, which makes it compatible with containers and gives it greater scalability.
Kubeflow Pipelines also uses the Python interface that’s familiar to many developers for defining and deploying pipelines. That enables metadata collection and lineage tracking of ML models.
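As a rough illustration of that Python interface, the sketch below defines a one-step pipeline with the Kubeflow Pipelines SDK and compiles it to Tekton YAML. It assumes the `kfp` and `kfp-tekton` packages are installed; the pipeline and file names are illustrative, not from IBM’s announcement.

```python
# Minimal sketch: define a pipeline with the Kubeflow Pipelines Python
# SDK, then compile it to Tekton YAML instead of the default Argo YAML.
# Assumes `pip install kfp kfp-tekton`; names here are illustrative.
import kfp.dsl as dsl
from kfp_tekton.compiler import TektonCompiler


@dsl.pipeline(
    name="echo-pipeline",
    description="A one-step example pipeline.",
)
def echo_pipeline(message: str = "hello"):
    # A single containerized step; the SDK records the step's metadata,
    # which supports lineage tracking of the resulting model artifacts.
    dsl.ContainerOp(
        name="echo",
        image="alpine:3.12",
        command=["echo"],
        arguments=[message],
    )


# Emits a Tekton PipelineRun definition to echo_pipeline.yaml,
# ready to be applied to a cluster running Tekton Pipelines.
TektonCompiler().compile(echo_pipeline, "echo_pipeline.yaml")
```

The same pipeline definition compiles unchanged to either backend, which is the portability argument Singh and Gray-Donald make.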
Now, IBM has decided to open-source a revamped version of Kubeflow Pipelines especially for Tekton, which is an open-source project that provides a framework for quickly creating cloud-native continuous integration/continuous deployment pipelines. With Tekton, developers can build, test and deploy applications across multiple cloud providers or on-premises systems, because it abstracts away the underlying implementation details.
“The decision to adopt Kubeflow Pipelines on our side came with an internal requirement to redesign Kubeflow Pipelines to run on top of Tekton,” Singh and Gray-Donald said.
Available today, the new Kubeflow Pipelines on Tekton lets developers declare CI/CD pipelines using Kubernetes-style resources. Those build on several Custom Resource Definitions that Tekton introduces, such as Task, Pipeline, TaskRun and PipelineRun.
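For readers unfamiliar with those resources, here is a minimal sketch of a Tekton Task paired with the TaskRun that executes it. The names and image are illustrative, not taken from IBM’s release:

```yaml
# A minimal Tekton Task and the TaskRun that executes it
# (illustrative sketch; resource names are hypothetical).
apiVersion: tekton.dev/v1beta1
kind: Task
metadata:
  name: echo-task
spec:
  steps:
    - name: echo
      image: alpine:3.12
      script: |
        echo "hello from Tekton"
---
apiVersion: tekton.dev/v1beta1
kind: TaskRun
metadata:
  name: echo-task-run
spec:
  taskRef:
    name: echo-task
```

A Pipeline chains several such Tasks together, and a PipelineRun executes the whole chain, mirroring the Task/TaskRun relationship above.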
In addition, IBM has created OpenShift Pipelines based on Tekton for its Red Hat OpenShift platform. That’s used to automate the building, testing and deployment of containerized applications for on-premises and public cloud platforms.
“Within IBM, we have standardized on Tekton as a cloud CI/CD engine, and OpenShift Pipelines is based on Tekton,” Singh and Gray-Donald said.
IBM is also planning to add to its stack with the coming launch of Watson AI Platform Pipelines. That will provide features that make it even easier for developers to build ML pipelines with a drag-and-drop canvas, pipeline components registry and production-grade logging. It will also provide better governance capabilities via integrations with IBM Watson Studio, Watson AutoAI, Watson Knowledge Catalog and others.