Databases (Incuding Hadoop Hive/IMPALA/Shark)

Relational databases, including Oracle, SAP Sybase IQ/ASE, MySQL, MS SQL Server, and MS Access, are accessed via this connector. Some databases connect using OLE DB, while others connect using ODBC and the Microsoft OLE DB for ODBC driver.

When connecting via ODBC, the appropriate SQL dialect must be specified, as different databases have different syntax. This is especially the case when dealing with null handling, time bucketing, and shipping aggregation to the database.

HadoopHive should be used for:

Hive, Impala, Spark, Shark & Spark SQL

AnsiSQL should be used for any remaining database not listed such as Greenplum.

SQL Server refers to MS SQL Server.

While Sybase is divided into:

Sybase ASE
Sybase IQ / AQA

As they have different syntax.

If there is a need for additional SQL dialect, please contact your sales representative.

NOTE: For the Access database, it is recommended to install the appropriate 32-bit or 64-bit version of the ACE driver to the corresponding Panopticon Designer (Desktop) version.

The Star Schema data model is now introduced to Database connectors wherein a large table (or fact table) is joined to one or several dimension tables. Consequently, instead of using a full table scan on large tables, joining the key of the fact table with the corresponding key of the dimension table reduces the data set processed. To further reduce the process time, you can also query the schema with on demand, or with selected aggregates, and parameters.

Adding a Database

When creating a new data table, select Database from the Connect to data dialog. The Data Link Properties dialog displays.

Select the appropriate OLE DB provider for your database.
If the database has an ODBC driver, and you have already setup an ODBC DSN, select Microsoft OLE DB Provider for ODBC Drivers.

Click the Next >> button. The Connection tab of the Data Link Properties dialog is displayed. This tab will be dependent on the selected provider. The screen shot below shows the tab for an ODBC connection.

In this case:

- Select the predefined Data Source Name (DSN).
- Enter username and password as appropriate.
- Check the Allow saving password box.
- Select the initial catalog to use as appropriate.
- Test the connection.

Click OK. The Database Connection dialog is displayed.

The Connection Settings collapsible section, which includes the data connection strings that can be edited manually or graphically via the Edit… button as required.

Java connection settings can be applied for those who will be publishing workbooks to Panopticon Visualization Server. Either the JNDI name or JDBC URL should be entered.

- JNDI Name
Enter the JNDI resource name to be used, then the Username and Password.
- URL

Enter the JDBC database connection URL, the Driver Class Name specific to the driver, and the Username and Password.

This section can be left blank if the Panopticon Visualization Server is not utilized.

The dialog allows selection of columns from these Tables and Views, or alternatively by selecting the Query expander button, entering of SQL queries, or stored procedures.

Once either a table or view or a query has been selected the OK button is enabled.

Selection of a Table and columns dynamically generates the SQL query, which is displayed in the Query text box.

Select the appropriate SQL dialect to be able to generate the correct SQL for the required data repository. The default dialect is ANSI SQL.

If Tables & Views is selected, the section below is enabled.

Clicking Load Tables loads the list of tables and views in the left list box (Fact Tables), which can be filtered by entering text in the Search Tables search box. The same list of tables and views is loaded in the right list box (Dimension Tables), which can be filtered by entering text in the Search Tables list box.

NOTE: The star schema data model is not supported for Access Dialect. Hence, the Join section (or the Dimension Tables list box) is not enabled.

Select the fact table. This will be the “center” of the star schema data model. Consequently, the selected fact table will no longer be displayed in the Dimension Tables list box.

Also, the corresponding list of available columns (<Table name>.<Column name>) is displayed in the Search Columns section. By default, all columns are returned through SELECT * FROM TABLE.

Click the Query expander button to display the generated query:

Perform a join by checking one or more dimension table.

You can also add duplicate columns (Numeric or Date/Time).

They are added in the Output Columns list.

You can opt to delete the duplicate columns by clicking .

Limit the number of columns in the star schema by checking the corresponding Column box in the Output Columns listing. To add all columns, click the topmost check box.

If you wish to parameterize a specific column, check the Parameterize check box, and match the parameter to the appropriate column. By default, they will be matched on name.

If a parameter has been defined in the data table, the SQL Query can automatically be modified to refer to it.

NOTE: The SQL Query can be manually defined or modified.

If the data returned is to be aggregated, then the Aggregate box should be checked.

For each selected column the possible aggregation methods are listed including:

- Text Columns: Group By
- Date Columns: Count, Min, Max, Group By
- Numeric Columns: Sum, Count, Min, Max, Group By

The time zone of input parameters and output data is by default unchanged.

Changing the time zone is supported through the Timezone list box, based on the assumption that data is stored in UTC time and outputs are presented in the selected time zone.

As Panopticon Designer (Desktop) requires date time fields, if timespan fields are returned from the database, they are converted to date times using the defined Timespan base time.

If a manual query has been selected, rather than using the Table & View query builder, two additional options are enabled.

- Enclose parameters in quotes
- Allow in memory parameter filtering

If only a selected date time range of the table / view is to be queried, the selected time column or columns should be selected, then the constraint checkbox checked, and the From and To text boxes completed; either with values or with parameters.

It is more efficient to define the date time constraint in the dialog, than by entering a manual query, especially if Panopticon Designer (Desktop) will be dynamically issuing on demand queries, as nested select statements will be avoided.

If a manual query has been selected, rather than using the Table & Views query builder, three additional options are enabled:

- Uncheck Enclose parameters in quotes. This removes the quotes when adding parameter values, and is typically used when parameterizing returned column or table names. By default this option is checked, as the common use case for parameters is as a filter WHERE clause.
- Check Allow In-Memory parameter filtering. This allows the whole dataset to be returned, and then filtered in memory. This process is much less efficient than adding the parameter as a WHERE clause of the SQL query; however, it may be efficient in cases where small sets of records are returned on a very frequent basis.
- Check Use Data Modification Query. This signals that the table is created for writing data. This property is also used for filtering out target data tables for further data update action configuration.

The Connection Settings section can be expanded for additional configuration. This can include Java connectivity, which will solely be used by the Panopticon Visualization Server for data retrieval, and the .NET Connection Name, which may be solely used by the Panopticon Visualization Server.

For the connection to the .NET (or Panopticon Designer (Desktop)), enter the following information:

- Connection Name
- Fallback Connection String

For the connection to the Panopticon Visualization Server, set or enter the following information:

- JNDI Name or JDBC URL
- Username and Password

Note that these connection settings can be parameterized.

IMPORTANT:

In Designer, you are required to make certain settings related to the JNDI data source, even though Designer is not using for its own purpose. The settings are instructions for the Panopticon Server, telling it which data source to use for this particular data table in the Panopticon Designer (Desktop) workbook.

For example, in Designer, enter the following value:

Java:comp/env/jdbc/sqliteSABR

Where jdbc/sqliteSABR is the value of the name-attribute in the JNDI resource entry in panopticon.xml in Panopticon Visualization Server.

Refer to the Database section in the Panopticon Visualization Server Installation and Troubleshooting Guide for more information.

Click OK to confirm the selection and retrieve the record set into Panopticon Designer (Desktop).

The flat record set corresponding to the executed SQL is returned from the source database and displayed in the Edit Data Table view, with the selected database name as the title, and selected fields listed displayed in the Data Source Preview.

Click in the data source box.

The data source specific Settings dialog displays with the following information:

- Row Limits
- Connection Strings
- Query/Tables & Views Settings

Click in the Data Preview area. The Datasource - Transforms dialog displays with the following information:

- Pivot
- Unpivot
- R Transform
- Python Transform
- Time Series Transform

Click on a field dropdown in the Data Preview area to:

- Modify the data types.
- Change the display Titles.
- Define the display formats for numeric fields. The default setting is: #,##0.00
- Define new auto-generated Key column, calculated column, or grouping column