Altair SmartWorks Analytics

Registering Models

This page describes how different models are registered in SmartWorks Analytics.

Registering Sklearn Models

To register an Sklearn model, you must upload a single .pkl file.

Supported Versions for Sklearn

In this release, only the following scikit-learn package versions are supported:

  • 0.21.3

  • 0.22.2.post1

  • 0.23.2

  • 0.24.1
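
The following is a minimal sketch of how a .pkl file might be produced, assuming the model is serialized with Python's standard pickle module and that the version used to train the model should match one of the versions listed above. The helper name save_model_as_pkl and its parameters are hypothetical and are not part of SmartWorks Analytics.

```python
import pickle

# Versions listed as supported in this release.
SUPPORTED_SKLEARN_VERSIONS = {"0.21.3", "0.22.2.post1", "0.23.2", "0.24.1"}


def save_model_as_pkl(model, path, sklearn_version):
    """Pickle `model` to `path` after checking the scikit-learn version.

    Hypothetical helper: in practice, pass a fitted estimator as `model`
    and sklearn.__version__ as `sklearn_version`.
    """
    if sklearn_version not in SUPPORTED_SKLEARN_VERSIONS:
        raise ValueError(
            f"scikit-learn {sklearn_version} is not in the supported list: "
            f"{sorted(SUPPORTED_SKLEARN_VERSIONS)}"
        )
    # Write a single .pkl file that can then be uploaded for registration.
    with open(path, "wb") as f:
        pickle.dump(model, f)
    return path
```

A typical call would look like save_model_as_pkl(fitted_estimator, "model.pkl", sklearn.__version__), where fitted_estimator is your trained model.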

Registering Knowledge Studio Models

To register a Knowledge Studio model, you must upload a single .py file.

The specific format for registering Knowledge Studio models is as follows:

  • If you have a Single Model, it must have a top-level Python class that has a single predict() function

    • If the class has an IVs attribute, the predict() function should take the variables directly as arguments, in the same order as the IVs list

    • If the class does not have an IVs attribute, the predict() function should take a Pandas row as input

    • The predict() function can return a NumPy array, a dictionary, or a list

  • If you have an Ensemble, it must have a top-level Python class with sub-classes

    • Each sub-class is a Single Model with an IVs attribute and a predict() function that takes the variables directly as arguments, in the same order as the IVs list

    • The predict() function of each Single Model returns data in the following format:

      • For classification models: node_id, node_number, probabilities (multiple items), node_size

      • For regression models: node_id, node_number, prediction, standard deviation, node_size

    • If the model is for classification, DVCategories must be defined in every sub-class. If the model is for regression, DVCategories must not be defined in any sub-class
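
The layout described above can be sketched roughly as follows. This is an illustration only, not code generated by Knowledge Studio: the class names, variable names, scoring logic, and returned values are all hypothetical, and the reading of "sub-classes" as nested classes is an assumption. The two Single Model variants are shown side by side for comparison, whereas an actual file would contain only one top-level class.

```python
class SingleModelWithIVs:
    """Single Model variant that declares its inputs in IVs."""

    IVs = ["age", "income"]  # hypothetical variable names

    def predict(self, age, income):
        # Arguments arrive directly, in the same order as the IVs list.
        score = 0.01 * age + 0.00001 * income
        return [score]  # a list; a dictionary or NumPy array is also allowed


class SingleModelWithRow:
    """Single Model variant without IVs: predict() receives one row."""

    def predict(self, row):
        # `row` is expected to be a Pandas row; fields are read by name.
        return {"score": 0.01 * row["age"] + 0.00001 * row["income"]}


class EnsembleModel:
    """Ensemble: a top-level class containing Single Model sub-classes.

    Nesting the sub-classes is an assumption made for this sketch.
    """

    class Tree1:
        IVs = ["age", "income"]
        DVCategories = ["no", "yes"]  # classification only; omit for regression

        def predict(self, age, income):
            # node_id, node_number, one probability per category, node_size
            if age > 40:
                return [3, 2, 0.2, 0.8, 120]
            return [2, 1, 0.7, 0.3, 80]


def call_with_ivs(model, record):
    """Hypothetical caller: unpack `record` in IVs order and call predict()."""
    return model.predict(*(record[name] for name in model.IVs))
```

The call_with_ivs helper only illustrates why the argument order of predict() has to match the IVs list: the values are passed by position, not by name.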

Supported versions of Knowledge Studio models with MLOps

  • 2021.1.0

Supported Knowledge Studio Models

Predictive Models

  • Linear Regression

  • Logistic Regression

  • Deep Learning - Regression

  • Deep Learning - Classification

  • Factor Analysis

  • PCA

  • Regularization

  • Strategy Tree

  • Decision Tree - Regression

  • Decision Tree - Classification

  • Reject Inference

Ensembles

  • Bagging - Regression

  • Bagging - Classification

  • Boosting - Classification

  • Random Forest - Regression

  • Random Forest - Classification

NOTE: Boosting is only valid for binary classification, so there is no Boosting - Regression model.

Registering Python Models

You must upload a single .py file to register a Python model.

The specific format for registering Python models is as follows:

  • The model must be a class that extends the mlflow.pyfunc.PythonModel class from MLflow. The class must define a predict(self, context, model_input) function.

  • In the predict() function, the model_input variable is a Pandas DataFrame. The data returned from this function must be either a Python list or a NumPy ndarray.

  • The class can define additional functions, for example to perform pre-processing or post-processing.

  • pandas==1.1.3 and numpy==1.20.3 are installed in the container that is deployed for this model, so you can use those libraries to process your data.

Example code is shown below.

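import mlflow

class MyModel(mlflow.pyfunc.PythonModel):

    def preprocess(self):
        # Optional helper for pre-processing (placeholder)
        pass

    def postprocess(self):
        # Optional helper for post-processing (placeholder)
        pass

    def predict(self, context, model_input):
        # model_input is a Pandas DataFrame; build one output entry per row
        out = list()
        for i, x in model_input.iterrows():
            if x["size"] > 50:
                out.append([True, 10])
            else:
                out.append([False, 20])
        return out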
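
Before uploading the file, it can be useful to call predict() locally and confirm that it accepts a DataFrame and returns a list or an ndarray. The helper below is a minimal sketch of such a check. The name smoke_test_predict is hypothetical, passing None as the context assumes that predict() does not use it, and the one-entry-per-row check is an assumption rather than a documented requirement.

```python
import numpy as np
import pandas as pd


def smoke_test_predict(model, frame):
    """Call model.predict() with a DataFrame and check the result type."""
    if not isinstance(frame, pd.DataFrame):
        raise TypeError("model_input must be a Pandas DataFrame")
    # The context is passed as None; this sketch assumes predict() ignores it.
    result = model.predict(None, frame)
    if not isinstance(result, (list, np.ndarray)):
        raise TypeError(
            f"predict() returned {type(result).__name__}, expected list or ndarray"
        )
    # Assumption: one output entry is expected per input row.
    if len(result) != len(frame):
        raise ValueError("predict() should return one entry per input row")
    return result
```

With the MyModel class above, smoke_test_predict(MyModel(), pd.DataFrame({"size": [10, 75]})) returns [[False, 20], [True, 10]].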

Supported versions of Python models with MLOps

  • 3.7

Registering Models from MLflow UI

Models that were created by the AutoML node will appear in the MLflow UI. You can register them from the MLflow UI so that they show up in the MLOps Node’s Model Registry.

When a model is successfully registered from the MLflow UI, it is displayed on the Model Registry page.

This newly registered model will also appear in the MLflow UI in the Models and Experiments sections.

You can modify the tags of a model via a Code node.

NOTE: If you intend to deploy your model, the packageVersion that you tag for the model must be supported for that MODEL_TYPE in MLOps.

from mlflow.tracking import MlflowClient

MODEL_NAME = "workflow-iris-automl-model"
MODEL_VERSION = "2"
CREATED_BY = "George"
MODEL_TYPE = "SKLEARN"
SKLEARN_VERSION = "0.23.2"

TAGS = {"type": MODEL_TYPE, "createdBy": CREATED_BY, "packageVersion": SKLEARN_VERSION}

client = MlflowClient()

for k, v in TAGS.items():
    # Set tag for the specific model version
    client.set_model_version_tag(name=MODEL_NAME, version=MODEL_VERSION, key=k, value=v)
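
Because of the NOTE above, it may help to check the packageVersion tag against the supported versions before setting the tags. The following is a minimal sketch of such a check. The function name check_package_version and the lookup table are hypothetical, and only the SKLEARN type is filled in because it is the only MODEL_TYPE value shown on this page.

```python
# Supported versions copied from the Sklearn list on this page.
SUPPORTED_PACKAGE_VERSIONS = {
    "SKLEARN": {"0.21.3", "0.22.2.post1", "0.23.2", "0.24.1"},
    # Add other model types here once their tag values are confirmed.
}


def check_package_version(tags):
    """Raise ValueError if tags["packageVersion"] is not supported for tags["type"]."""
    model_type = tags.get("type")
    version = tags.get("packageVersion")
    supported = SUPPORTED_PACKAGE_VERSIONS.get(model_type)
    if supported is None:
        raise ValueError(f"No supported-version list known for type {model_type!r}")
    if version not in supported:
        raise ValueError(
            f"packageVersion {version!r} is not supported for type {model_type!r}"
        )
    return True
```

In the example above, check_package_version(TAGS) could be called before the loop that sets the tags, so that an unsupported version is caught before the model is tagged.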