Altair SmartWorks Analytics

 

Joining Data

Joins allow you to combine tables side-by-side and require a key column that is present in both tables. This column is used to match rows between tables.

In the following example, we will join two tables. Table 1 contains employee information, while Table 2 contains payroll information.

Table 1 containing employee information.

 

Table 2 containing payroll information.

 

Prerequisite

  • An Execution Profile with an active session linked to the workflow

Steps

  1. Produce two Data Frame nodes by importing two data tables.

  2. From the Data Preparation group of the Nodes tabbed page, drag and drop the Join node from the Palette to the Workflow Canvas. The Join node has two input sockets and one output socket. Connect the output socket of each of the Data Frame nodes produced in Step 1 to the input sockets of the Join node.

  3.  

  4. Configure the Join node by double-clicking on the node or using the Open option provided in the node menu.

  5.  

  6. Specify the following settings in the Join Node Viewer:

  7. Property

    Description

    Input Properties

    Table Names - The name of the tables to be joined. This field is automatically populated from the Data Frame node and cannot be edited.

    Output Properties

    Table - The table name.

     

    Join Type

    You can create:

    • Standard Joins, including Inner Join, Left Outer Join, Right Outer Join, or Full Outer Join

    • Filtering Joins, including Left Semi Join, Left Anti Join, Right Semi Join, or Right Anti Join.

     

    Descriptions of these join types display when you select a join type.

    Key Columns

    A key column is a column found in both tables containing values to be used to match records for the joined table.

    Select a common column between your data inputs by clicking the Key icons located to the left of the relevant columns in the Columns to Include section.

     

    When a key column is selected from the Columns to Include section of the Configuration panel, it is added to the Key Columns section and grayed out in the Columns to Include section.

    Columns to Include

    Specify which columns from the data inputs should be included in the join output. A search option is also provided to address inputs with numerous columns.

     

    Included Column Names

    Set prefixes or suffixes for columns in your data inputs if you wish. These prefixes/suffixes will display in the data output.

     

     

    When you have finished configuring your join table, a preview of your data displays in the right-hand side of the Node Viewer.

     
  8. Check the code that will be executed for your specified Join configuration by saving your current configuration and then clicking on the Code tab of the Join Node Viewer. You can use the Code Editor to refine the code further.

  9.  

  10. Complete the Join Node configuration by clicking Save. To cancel your changes and return to the Workflow Canvas, press Discard or simply close the Join Node Viewer. To execute the Join, click the Run button.