Altair® Panopticon

 

Python Transform

A Python script can be executed as a data transformation step in the data pipeline.  Specifically:

  • Data is retrieved from an underlying source.

  • The returned data table is translated into a Python object, specifically a list of dictionaries.

  • The Python object and the supplied Python script are passed to an external Python process running Pyro (Python Remote Objects), e.g. https://pypi.python.org/pypi/Pyro4/

  • The external Pyro process returns a list of dictionaries.

  • The returned list of dictionaries is translated into a Panopticon Designer (Desktop) table for visualization rendering.
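The list-of-dictionaries shape described above can be pictured with a small sketch. The column names, the sample values, and the transform function are hypothetical and are not taken from the product. The sketch only illustrates that each row becomes one dictionary keyed by column name, and that the script hands back the same kind of structure.

```python
# Hypothetical table: column names plus rows of values.
columns = ["symbol", "price", "quantity"]
table_rows = [
    ("ABC", 10.5, 100),
    ("XYZ", 20.0, 50),
]

# Each row becomes one dictionary keyed by column name.
data = [dict(zip(columns, row)) for row in table_rows]


def transform(rows):
    """Return a new list of dictionaries with an added 'notional' column."""
    return [dict(row, notional=row["price"] * row["quantity"]) for row in rows]


# The result is again a list of dictionaries, one per output row.
result = transform(data)
```

Because the output has the same shape as the input, it can be mapped back to table columns by key.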

 

NOTES:

  • When used with streaming data sources (e.g., message bus), the Real Time Limit of a streaming data source should be set to a value longer than the time taken to perform the Python data transform.

For example, if the transform operation takes 2 seconds, the Real Time Limit should be set to 2500 milliseconds.

  • When used for non-streaming data sources (e.g., Database), the Data Table Refresh period should be set to a value longer than the time taken to perform the Python data transform.

For example, if the transform operation takes 2 seconds, the Data Table Refresh period should be set to 3 seconds.
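One way to arrive at such values is to time the transform outside the product and add headroom. The following is a minimal sketch under that assumption. The helper names and the headroom factors are illustrative choices, not product settings.

```python
import math
import time


def measure_ms(transform, rows):
    """Return the wall-clock time of one transform(rows) call, in milliseconds."""
    start = time.perf_counter()
    transform(rows)
    return (time.perf_counter() - start) * 1000.0


def with_headroom(elapsed_ms, factor=1.25):
    """Scale a measured duration by a safety factor and round up to whole ms."""
    return math.ceil(elapsed_ms * factor)


# A 2-second transform with 25% headroom matches the 2500 ms example.
real_time_limit_ms = with_headroom(2000)

# The same transform with 50% headroom matches the 3-second refresh example.
refresh_period_ms = with_headroom(2000, factor=1.5)
```

Timing several runs on representative data and taking the slowest one gives a safer baseline than a single measurement.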

 

When the Python Transform is selected the dialog changes to show:

 

Performing Python Transform

  1. Check Enable Python Transform.

  2. Specify the Host and Port of the Pyro process, along with the HMAC key (Password).

  3. Specify the Data Object Name. This defines the data structure (list of dictionaries) that Panopticon Designer produces and that the Python script then uses.

  4. Select the Serialization Type: Serpent or Pickle.

    • Serpent – simple serialization library based on ast.literal_eval

    • Pickle – faster serialization but less secure

Modify the configuration.py file located in ..\Anaconda3\Lib\site-packages\Pyro4 to specify the serialization to be used.

For example, if Pickle is selected, change the self.SERIALIZER value to pickle and change the self.SERIALIZERS_ACCEPTED value so that it includes pickle.
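The trade-off between the two serialization types can be illustrated with the Python standard library alone. This sketch does not use Serpent or Pyro. It uses repr together with ast.literal_eval as a stand-in for a literal-based serializer, and the sample rows are hypothetical.

```python
import ast
import pickle

rows = [
    {"symbol": "ABC", "price": 10.5},
    {"symbol": "XYZ", "price": 20.0},
]

# Literal-based round trip: the text form can only describe plain Python
# literals, so parsing it cannot run arbitrary code.
text_form = repr(rows)
rows_from_text = ast.literal_eval(text_form)

# Pickle round trip: a binary form, but unpickling untrusted bytes can
# execute arbitrary code, so it suits only connections between trusted hosts.
binary_form = pickle.dumps(rows)
rows_from_pickle = pickle.loads(binary_form)
```

Both round trips return the original rows. The difference lies in what a malicious payload could do on the receiving side.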

 

 

  5. Enter the Python Script or load it from an associated file (selected by clicking the Browse button). The script returns the output list of dictionaries. Just like an underlying SQL query, the Python script itself can be parameterized.

NOTE: This step will work for small and simple use cases. However, when you have several transforms, or when each transform is applied to several data tables, it is highly recommended to follow the instructions in the Best Practices on Working with Python Transform in Panopticon section.

 

  6. Click Test Connection. A confirmation dialog displays to show that the connection was successfully established.

  7. Specify whether to Enclose Parameter in Quotes.

  8. The Timeout is set to 10 seconds by default to ensure that slow-running Python scripts do not impact other areas of the product. You can enter a different value.

  9. Click Apply. This prepares the time series analysis.

  10. Refer to Enable Time Series Analysis for more information on enabling this feature.

Enabling the time series analysis when you perform a Python Transform solves the problem of having to specify all of the values. It also allows you to choose which Time column should be used to specify the time series.

  11. Click OK.
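The effect of the Enclose Parameter in Quotes option in the steps above can be pictured as plain text substitution into the script before it runs. The following sketch assumes this. The {Name} token syntax, the choice of double quotes, and the substitute helper are all assumptions made for illustration and do not describe the product's actual implementation.

```python
import re


def substitute(script, params, enclose_in_quotes):
    """Replace each {Name} token in script with its value from params."""
    def replace(match):
        name = match.group(1)
        if name not in params:
            return match.group(0)  # leave unknown tokens untouched
        value = str(params[name])
        return '"' + value + '"' if enclose_in_quotes else value

    return re.sub(r"\{(\w+)\}", replace, script)


# A text parameter needs quotes to become a valid Python string literal.
script = "result = [row for row in data if row['symbol'] == {Symbol}]"
quoted = substitute(script, {"Symbol": "ABC"}, enclose_in_quotes=True)

# A numeric parameter is usually inserted without quotes.
unquoted = substitute("limit = {Limit}", {"Limit": 10}, enclose_in_quotes=False)
```

Under these assumptions, a text value inserted without quotes would be read by Python as a variable name rather than a string, which is the situation the option guards against.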