Altair SmartWorks Analytics

 

Deployments

The Deployments page is the second of the two tabs that display when the MLOps node is opened. It shows a list of the models that have been deployed.

This page lets users deploy models to obtain an endpoint to which they can send API requests and receive prediction responses.

 

The main page of the Deployments tab displays a table with the following columns:

  • Deployment Name – the name assigned to the deployment

  • Created by – the user who created the deployment

  • Last Updated – when the deployment was last updated

  • Status – the current status of the deployment, explained below

You can sort any of these columns by clicking the sorting buttons located to the right of each column header. Typing the first few characters of a deployment name into the Search box at the upper right-hand corner of the page displays all deployments matching this string. Different deployment profiles can be viewed by selecting the appropriate profile from the drop-down list beside the Search bar.

You can also add a new deployment by clicking on the Add Deployment button.

Statuses

  • LIVE – The deployment is up and running. The user can send requests to the model endpoint; if the requests are formatted correctly, they should receive a response.

  • DEPLOYING – The deployment is being made for the first time.

  • UPDATING – The deployment is being updated. This happens when a deployment that was previously LIVE has had some of its settings changed, such as the model, the resource allocation, or the number of pods. While the deployment is in this state, the user can still send requests to the model endpoint and, if the requests are formatted correctly, receive a response.

  • FAILED – The deployment failed while Kubernetes was trying to create the underlying resources: deployments, services, and pods.

NOTE: Deployments can sometimes get stuck in the pending state on Kubernetes. This can happen, for example, when there are not enough resources on the cluster for the requested pods. In this case, the deployment state is shown as either DEPLOYING or UPDATING, depending on what was happening at the time.
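If you have direct access to the cluster, you can check whether pods are stuck in the Pending phase. The snippet below is a minimal sketch using the official kubernetes Python client; the namespace name is an assumption and depends on your installation.

from kubernetes import client, config

# Load credentials from the local kubeconfig (requires cluster access).
config.load_kube_config()

v1 = client.CoreV1Api()

# "seldon-qa" is a hypothetical namespace; use the one your deployments run in.
for pod in v1.list_namespaced_pod(namespace="seldon-qa").items:
    print(pod.metadata.name, pod.status.phase)  # e.g. Pending, Running, Succeeded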

 

Adding a New Deployment

Steps:

  1. Click on the Add Deployment button in the Deployments page.

  2. The Add New Deployment page displays.

     

  3. Provide the following properties:

    Deployment Name

    The name assigned to the deployment. This name displays in the Deployments page.

    Deployment Type

    Choose one of three possible deployment types:

    • Regular – a single model receiving all of the traffic

    • A/B/n Testing – multiple models with a user-specified traffic distribution (Note: The traffic total for all models must add up to 100%)

    • Champion / Challenger – two models, each receiving exactly the same traffic, but only the champion sends the response back to the requester

    Model Repository

    The MLflow internal connection under which the model is registered

    Model Name

    The model to deploy; this model must already be added to the Model Registry

    Model Version

    The version of the selected model to use

    Scaling Properties

    Choose one of two scaling types:

    • Autoscaling – allows the user to set the minimum and maximum pod counts for the Kubernetes Horizontal Pod Autoscaler, along with the target average % utilization of CPU and RAM

    • Manual – allows the user to set the number of pods and the Kubernetes resource requests for CPU and RAM

    When the scaling type has been selected, specify the amount of resources to provide for the deployment:

    • CPU

    • RAM

    • Number of Pods

    • CPU Target %

    • RAM Target %

    • Min Pods Count

    • Max Pods Count

    Inputs must be in Kubernetes resource format; for example, CPU = 100m (one-tenth of a CPU core) and RAM = 32Mi (32 mebibytes).

  4. Click Deploy when you are finished.

The new deployment displays in the Deployments page.

Viewing a Deployment

The properties of a deployment display when you click a deployment in the Deployments page.

 

Editing a Deployment

When a specific deployment is selected in the Deployments page, several buttons display at the top of the page.

 

You can use these buttons to open, edit, or remove (delete) the selected deployment. Selecting Edit displays the deployment properties page. You can, for example, change the model to use, the deployment type, the amount of resources/scaling type, or the deployment profile.

Request-Response Formats

Authentication

Before sending any requests to a deployed model endpoint, you must retrieve a TOKEN from Keycloak. You can do so by executing the following script:

 

curl --data "username=UserAdmin&password=MyPassword&grant_type=password&client_id=istio-gateway" \

     https://smartworks.altair.com/auth/realms/smartworks/protocol/openid-connect/token

 

 

Where, in the above code:

  • "UserAdmin" is the Keycloak username

  • "MyPassword" is the Keycloak password

  • "istio-gateway" is the name of a client in Keycloak that has been configured for authentication on the Seldon cluster using an AuthorizationPolicy

You can then store the returned access_token in an environment variable called TOKEN and follow the rest of the information below for sending requests.
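For example, the following minimal Python sketch reads the token from the TOKEN environment variable and sends it as a Bearer header; the deployment name in the URL is hypothetical:

import os
import requests

# Read the token from the TOKEN environment variable (set beforehand).
token = os.environ["TOKEN"]

# "my-deployment" is a hypothetical deployment name; substitute your own.
MODEL_ENDPOINT = "https://seldon.smartworks.altair.com/seldon/seldon-qa/my-deployment/api/v1.0/predictions"

REQUEST_HEADERS = {"Authorization": "Bearer {}".format(token),
                   "Content-Type": "application/json"}
REQUEST_DATA = {"data": {"ndarray": [[33, "Male", 50]],
                         "names": ["age", "sex", "hours-per-week"]}}

r = requests.post(url=MODEL_ENDPOINT, json=REQUEST_DATA, headers=REQUEST_HEADERS)
print(r.text)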

Requests

You can send a total of three inputs to a Seldon Core endpoint with your requests:

  • data – required; this must be a numpy array

  • names – optional / sometimes required; this must be a list

  • meta – optional; this must be a dictionary

The following example shows a request using curl:

 

curl -X POST https://seldon.smartworks.altair.com/seldon/seldon-qa/seldon-model/api/v1.0/predictions \

     -H "Authorization: Bearer $TOKEN" \

     -H 'Content-Type: application/json' \

     -d '{ "data": { "ndarray": [[33, "Male", 50], [32, "Female", 60]], "names": ["age", "sex", "hours-per-week"] } }'

 

 

The following example shows another request in Python:

 

import requests
import json

### VARIABLES
TOKEN_URL = "https://smartworks.altair.com/auth/realms/smartworks/protocol/openid-connect/token"
AUTH_DATA = {"username": "*******",
             "password": "*******",
             "grant_type": "password",
             "client_id": "istio-gateway"}

### Get token
r = requests.post(url=TOKEN_URL, data=AUTH_DATA)
token = r.json()["access_token"]

### Run request on endpoint
DEPLOYMENT_NAME = "my-deployment"
MODEL_ENDPOINT = "https://seldon.smartworks.altair.com/seldon/seldon-qa/{}/api/v1.0/predictions".format(DEPLOYMENT_NAME)
REQUEST_DATA = {"data": {"ndarray": [[33, "Male", 50, "Widowed"], [32, "Female", 60, "Widowed"]],
                         "names": ["age", "sex", "hours-per-week", "marital-status"]}}
REQUEST_HEADERS = {"Authorization": "Bearer {}".format(token),
                   "Content-Type": "application/json"}
r = requests.post(url=MODEL_ENDPOINT, data=json.dumps(REQUEST_DATA), headers=REQUEST_HEADERS)
print(r)
print(r.text)

 

 

Responses

The response from Seldon always follows the same format, namely:

  • data – which has two items:

    • names – a list of the names of the classes that were predicted on

    • ndarray – the predictions as a 2D array

  • meta – any metadata or tags from the ML model

The example below shows a response. Note that the format of the data in the ndarray may differ between Sklearn, Knowledge Studio, and Python models.

 

{

    "data": {

      "names": [0, 1, 2],

      "ndarray": [[0.9612952553605776, 0.03840242822212667, 0.0003023164172958162],

                  [0.0019173336702884656, 0.2109311182562371, 0.7871515480734745]]

    },

    "meta": {}

}
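As an illustration, the following minimal sketch pairs each class name with its probability and picks the most likely class for each input row; the response dictionary is simply the example above pasted into Python:

# Hypothetical response, copied from the example above.
response = {
    "data": {
        "names": [0, 1, 2],
        "ndarray": [[0.9612952553605776, 0.03840242822212667, 0.0003023164172958162],
                    [0.0019173336702884656, 0.2109311182562371, 0.7871515480734745]]
    },
    "meta": {}
}

names = response["data"]["names"]
for probabilities in response["data"]["ndarray"]:
    # Take the index of the highest probability and map it back to a class name.
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    print("predicted class:", names[best], "probability:", probabilities[best])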