Dotscience Enables Simplest Method for Building, Deploying and Monitoring ML Models in Production on Kubernetes Clusters to Accelerate the Delivery of Business Value from AI

Dotscience, a pioneer in DevOps for Machine Learning (MLOps), announced new platform advancements that offer the easiest way to deploy and monitor ML models on Kubernetes clusters, making Kubernetes simple and accessible to data scientists. The new Dotscience Deploy and Monitor features dramatically simplify deploying ML models to Kubernetes and setting up monitoring dashboards for the deployed models with the cloud-native tools Prometheus and Grafana, reducing the time spent on these tasks from weeks to seconds. Dotscience now also enables hybrid and multi-cloud scenarios in which, for example, model training happens on-prem using an attached Dotscience runner and models are then deployed to a Kubernetes cluster in the cloud for inference using a Dotscience Kubernetes deployer. Dotscience also announced a joint effort with S&P Global to develop best practices for collaborative, end-to-end ML data and model management that ensure the delivery of business value from AI.

“While there are visionaries like S&P in the market who also recognize the need for reproducibility, provenance and enhanced collaboration in the model development phase of the lifecycle, our push to simplify deployment and monitoring of AI/ML is based on the market insight that many businesses are still struggling with deploying their ML models, blocking any business value from AI/ML initiatives,” said Luke Marsden, CEO and founder of Dotscience. “In addition, monitoring models in ML-specific ways is not obvious to software-focused DevOps teams. By dramatically simplifying deployment and monitoring of models, Dotscience is making MLOps accessible to every data scientist without forcing them to set up and configure complex and powerful tools like Kubernetes, Prometheus and Grafana from scratch.”

While other solutions on the market aim to solve only specific parts of ML development and operations, requiring further integration work to provide end-to-end functionality, Dotscience enables data science and ML teams to own and control the entire model development and operations process: from data ingestion, through training and testing, to deploying straight into a Kubernetes cluster and monitoring that model in production to understand its behavior as new data flows in. Furthermore, alongside the built-in Jupyter environment, Dotscience users can now work in any development environment they like via the Dotscience Python library.

“This allows teams to work faster, at scale and with confidence that their development process is fully accountable. Also, by enabling hybrid and multi-cloud scenarios, where training happens on-prem where the data is, and the deployment to production and inference happens in the cloud where Kubernetes is easy to set up, we enable flexible use of on-prem infrastructure along with easy access to harness the power of Kubernetes in the cloud,” continued Marsden.

Dotscience Deploy Makes Kubernetes Simple and Accessible to Data Scientists with a Single Command or Click

Data science and ML teams can use Dotscience to ingest data, perform data engineering, train and test models, and then deploy them to CI for further testing before final deployment to production, all with a single click, command or API call. Once deployed, the models can be statistically monitored.

Dotscience Deploy gives users the ability to:

  • Handle both building the ML model into a Docker image and deploying it to a Kubernetes cluster
  • Hand the entire CI/CD responsibility over to existing infrastructure, if preferred, or use lightweight built-ins
  • Track the deployment of the ML model back to the provenance of the model and the data it was trained on to maintain accountability across the entire ML lifecycle

“In keeping with Dotscience’s product philosophy of maximizing interoperability, users can choose to deploy their models from Dotscience through the CI tool of their choice, including using Dotscience’s built-in CI step if they don’t have or want to have their own,” said Mark Coleman, VP of Product and Marketing at Dotscience. “If a problem is reported with a deployed ML model, it is simple to trace back from the model running in production to the full provenance of data, code and hyperparameters that created the model in development, making debugging and auditing intuitive and fast.”

Users can deploy their models in three main ways:

  • UI deployments – After defining parameters in the UI, users can deploy straight from within the Dotscience Hub interface
  • CLI deployments – The Dotscience CLI tool ‘ds’ can be used to deploy an ML model, with command-line parameters defining the exact details
  • Python library deployments – Users can deploy directly from the Python library with ds.publish(deploy=True), which also automatically sets up a statistical monitoring dashboard (a sketch follows this list)
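
For orientation, here is a minimal sketch of the Python-library route. Only ds.publish(deploy=True) is quoted from this announcement; the import name dotscience, the ds.start(), ds.parameter() and ds.summary() calls, and the train() stand-in are illustrative assumptions about a typical run-tracking API rather than documented Dotscience calls.

    import dotscience as ds  # assumed import name for the Dotscience Python library

    ds.start()                           # assumed: begin a tracked run
    ds.parameter("learning_rate", 0.01)  # assumed: record a hyperparameter

    model = train(learning_rate=0.01)    # stand-in for your own training code
    ds.summary("accuracy", 0.94)         # assumed: record a metric for this run

    # Quoted in this announcement: builds the model image, deploys it to
    # Kubernetes and automatically sets up a statistical monitoring dashboard.
    ds.publish(deploy=True)

Because parameters and metrics are recorded alongside the deploy step, the provenance trail described above (data, code and hyperparameters) stays attached to whatever lands in production.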

Dotscience Monitor Tracks Health of ML Models Throughout Entire Lifecycle

Dotscience’s statistical monitoring feature allows ML teams to define which metrics they would like to monitor on their deployed models and then bring those metrics straight back into the Dotscience Hub interface where the team first developed the model. This allows ML teams to “own” the health of the model throughout the entire development lifecycle and avoids integrations with other monitoring solutions and costly handovers between teams. By enabling data science teams to own the monitoring of their models, Dotscience brings the notion of integrated DevOps teams to ML, eliminating silos, maximizing productivity and minimizing mean time to recovery (MTTR) if there are issues with a model.

“Often there’s a disconnect between the type of monitoring performed by operations teams, such as error rates and request latency, and the type of monitoring that machine learning teams need to do on their models when deployed to production, such as looking at the statistical distribution of predicted categories,” said Marsden. “With Dotscience, ML teams have insight into the context-specific monitoring information about their model, which better positions them to understand why an error occurred and respond to it, rather than putting this onus on a central operations team.”
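
Dotscience builds these dashboards automatically, but to make the idea concrete, here is a minimal hand-rolled sketch of the kind of ML-specific metric Marsden describes, using the standard Prometheus Python client (prometheus_client). The metric name, port and model.predict() call are illustrative assumptions, not part of Dotscience.

    from prometheus_client import Counter, start_http_server

    # Count predictions per class so Prometheus can scrape, and Grafana can
    # chart, the statistical distribution of predicted categories over time.
    PREDICTIONS = Counter(
        "model_predictions_total",
        "Predictions served, labelled by predicted category",
        ["category"],
    )

    def predict_and_record(model, features):
        label = model.predict(features)  # stand-in for your model's inference call
        PREDICTIONS.labels(category=str(label)).inc()
        return label

    start_http_server(8000)  # exposes /metrics for Prometheus to scrape

A sudden shift in this distribution, with no corresponding change in error rate or latency, is exactly the kind of model-level problem that operations-style monitoring would miss.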

Unblocking AI in the Enterprise

S&P Global found that at least 50% of models fail to get past innovation, the phase where new ideas, prototyping and testing take place, because of challenges in the incubation phase, such as building robust data pipelines so that models can be retrained in the future, and moving from individual data scientists doing ad-hoc work on local machines to a scaled-out team. The typical result is a significant waste of time and effort, and these challenges also risk AI efforts being blocked within the enterprise.

“In order to be able to reliably transition AI models from innovation, and rapidly through incubation where tested ideas are readied for production, and finally into productionization, the following properties of a system are required: collaboration, provenance, resource management, automated model workflows, holistic experiment management, hyperparameter optimization, model storage, a deployment strategy, monitoring capabilities and the ability to scale a model,” said Ganesh Nagarathnam, Director of Machine Learning at S&P Global. “Without these MLOps business drivers, more than 50% of AI models in the enterprise get blocked before they reach production where they can deliver business value. I’m excited to work with the Dotscience team to develop best practices for unblocking AI in the enterprise.”

“We’re proud to work with Ganesh and S&P Global to address and solve the challenges many in the data science and ML community face today,” said Marsden. “The new Dotscience Deploy and Monitor capabilities strongly align with the MLOps business drivers that S&P sees as essential to unblocking deployment of AI models in the enterprise.”
