Integrate data from HP NonStop SQL/MX to Databricks using Matillion

The HP NonStop SQL/MX to Databricks connector enables seamless and timely data migration to Databricks, automatically keeping your data current without the need for manual programming or management of complex ETL scripts.


Extracting data from HP NonStop SQL/MX to Databricks

Extracting data from HP NonStop SQL/MX is an essential task for organizations seeking to leverage modern analytics platforms such as Databricks. This article provides a practical guide to moving data efficiently from HP NonStop SQL/MX into Databricks, covering each key part of the workflow. First, we will outline the process of creating an identity within HP NonStop SQL/MX to ensure secure access and proper permissions for data extraction. Next, for users of Matillion, we will discuss how to check for an existing JDBC driver or acquire and configure the appropriate one to facilitate connectivity. We will also address network connectivity considerations, ensuring that the data exchange between the HP NonStop SQL/MX source and the Databricks target is reliable and secure. Finally, we will examine strategies for both initial and incremental data extraction, to enable not just a one-time data transfer but efficient ongoing synchronization. By following this step-by-step approach, readers can confidently set up a data integration pipeline tailored to their organization's analytical needs.


What is HP NonStop SQL/MX?

HP NonStop SQL/MX is an advanced relational database management system designed for the HPE NonStop server platform, prioritizing continuous availability, scalability, and robust transactional consistency. It provides ANSI SQL compliance—supporting complex queries, referential integrity, and advanced data types—making it well-suited for mission-critical applications in industries such as banking, telecommunications, and retail. Built to leverage the unique non-stop architecture of HPE servers, NonStop SQL/MX enables both OLTP and OLAP workloads through features such as distributed data access, parallel execution, and integrated fault tolerance mechanisms. Its architecture ensures seamless load balancing, fast failover, and online maintenance, allowing organizations to maintain high service levels without downtime.


What is Databricks?

Databricks is a unified analytics platform that uses Apache Spark for distributed processing and provides a collaborative environment for big data and AI workloads. Its managed lakehouse architecture integrates data engineering, data science, and analytics, anchored by Delta Lake, an open storage layer that adds ACID transactions for reliable data ingestion, transformation, and querying. Supporting both structured and unstructured data, Databricks excels at ETL pipelines, machine learning, and advanced analytics. Native integration with cloud storage (Azure Blob Storage, AWS S3, Google Cloud Storage) ensures scalable, flexible, and secure data management for a wide range of enterprise use cases.

Why Move Data from HP NonStop SQL/MX into Databricks

Unlocking Advanced Analytics: Migrating Data from HP NonStop SQL/MX to Databricks

A data engineer or architect may consider copying data from HP NonStop SQL/MX into Databricks for several compelling reasons. First and foremost, HP NonStop SQL/MX often holds data that is potentially valuable for business insight and decision-making. However, the true utility of this data is unlocked when it can be integrated with information from other enterprise sources, enabling comprehensive analysis and the discovery of new patterns or trends. By utilizing Databricks as the data integration and analytics platform, organizations can perform these complex data transformations and analyses without imposing additional processing load on the HP NonStop SQL/MX system itself. This approach helps to preserve the performance and reliability of the mission-critical HP NonStop environment while allowing the business to extract maximum value from its data assets.

Creating a User in HP NonStop SQL/MX

To create a new user (identity) in HP NonStop SQL/MX, you typically use the CREATE USER statement. This can only be performed by a database user with the appropriate privileges, such as DB__ROOT or another user holding the CREATE USER privilege.

Below are step-by-step instructions and example SQL scripts for creating a user in SQL/MX.


Prerequisites

  • You must have access to SQL CI (sqlci), MXCI (mxci), or an ODBC/JDBC client with sufficient privilege.
  • You must connect with an account that has the CREATE USER privilege (DB__ROOT or equivalent).

Steps to Create a User

1. Connect to the SQL/MX Database

Use your preferred client utility to connect to your SQL/MX data source:

$ mxci

or

$ sqlci

Authenticate as a user with adequate privileges.


2. Execute the CREATE USER Statement

The following is the basic syntax for creating a user:

CREATE USER <user_name> IDENTIFIED BY '<password>' [, ATTRIBUTE '<attribute_name>' = '<attribute_value>'];

Example: Create a Simple User

CREATE USER my_user IDENTIFIED BY 'MySecurePassword';

Example: Create a User With Specific Attributes

CREATE USER app_admin 
    IDENTIFIED BY 'Str0ngAdm1nPass!'
    ATTRIBUTE 'DEFAULT_SCHEMA' = 'APPDB'
    ATTRIBUTE 'DEFAULT_CATALOG' = 'CATALOG1';
  • DEFAULT_SCHEMA (optional): specifies the default schema for the user.
  • DEFAULT_CATALOG (optional): specifies the default catalog for the user.


3. Grant Required Privileges to the User

Newly created users often require additional privileges to access and manipulate database objects.

GRANT SELECT, INSERT, UPDATE, DELETE ON mydb.mytable TO my_user;

You can tailor this grant to match your security policies.


4. Commit Your Changes (if AUTOCOMMIT is off)

If your session does not autocommit SQL statements by default:

COMMIT;


Additional Notes

  • Usernames and passwords are subject to the platform's password policies and case sensitivity settings.
  • You can manage, alter, or drop users with the corresponding ALTER USER and DROP USER statements.
  • Always follow organizational security standards for user naming and password complexity.

Installing the JDBC driver

At the time of writing, the HP NonStop SQL/MX JDBC driver is not included by default in Matillion Data Productivity Cloud due to licensing and redistribution restrictions. Therefore, users are required to manually obtain and integrate the driver into their Matillion environment in order to connect to the HP NonStop SQL/MX database.

Step 1: Download the JDBC Driver

To acquire the JDBC driver for HP NonStop SQL/MX, visit the official HPE support portal at https://support.hpe.com/. Please note:

  • Once on the site, search for "NonStop SQL/MX JDBC driver".
  • Prefer the Type 4 JDBC driver, as it is pure Java and does not require native libraries, simplifying cloud integration.
  • You may need appropriate credentials or a support subscription to access the download section.

Step 2: Review Distribution Terms

Due to licensing requirements from HPE, ensure that you review and agree with the terms stated during the download process. Confirm that you are permitted to use the driver within your organization.

Step 3: Upload the JDBC Driver to Matillion Data Productivity Cloud

Once you have obtained the driver (usually a .jar file), you must upload it into your Matillion instance.

Step 4: Use the Driver in Your Data Workflows

Once installed, you can create connections and run queries against your HP NonStop SQL/MX instance from your pipelines.

By following these steps, technical users can download, install, and utilize the HP NonStop SQL/MX JDBC driver within Matillion, enabling seamless data integration workflows.
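As a sketch of what a connection definition looks like once the driver is in place, the snippet below assembles a JDBC URL for the SQL/MX Type 4 driver. The class name (com.tandem.t4jdbc.SQLMXDriver), URL prefix (jdbc:t4sqlmx://), and port are assumptions based on common Type 4 driver conventions; verify the exact values against the documentation that ships with your driver .jar.

```python
# Sketch: assemble a JDBC connection URL for the SQL/MX Type 4 driver.
# The driver class name, URL prefix, and default port below are assumptions;
# confirm them against HPE's driver documentation.

DRIVER_CLASS = "com.tandem.t4jdbc.SQLMXDriver"  # assumed class name


def build_sqlmx_url(host: str, port: int, catalog: str, schema: str) -> str:
    """Return a JDBC URL string for an SQL/MX Type 4 connection."""
    return f"jdbc:t4sqlmx://{host}:{port}/:catalog={catalog};schema={schema}"


# Hypothetical host and port for illustration only.
url = build_sqlmx_url("nonstop.example.com", 18650, "CATALOG1", "APPDB")
print(url)
```

In Matillion, the same values (driver class, URL, username, password) are entered into the connection settings of the Database Query component rather than written in code.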

Checking network connectivity

To enable connectivity between Matillion Data Productivity Cloud and your HP NonStop SQL/MX database, ensure the database is configured to accept incoming network connections from the Matillion agent appropriate to your deployment (Full SaaS or Hybrid SaaS).

Additionally, if your HP NonStop SQL/MX database is accessed using a DNS hostname rather than a static IP address, ensure that the corresponding Matillion agent (Full SaaS or Hybrid SaaS) is able to resolve the DNS address correctly. This may require configuring appropriate DNS settings or allowing access to your DNS server.
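A quick way to verify both requirements from the machine running the agent is to test DNS resolution and TCP reachability directly. The sketch below uses only the Python standard library; the host name and port are placeholders for your SQL/MX listener's actual address.

```python
import socket


def can_resolve(hostname: str) -> bool:
    """Return True if the local resolver can turn hostname into an IP address."""
    try:
        socket.gethostbyname(hostname)
        return True
    except socket.gaierror:
        return False


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example (hypothetical host/port for your SQL/MX listener):
# print(can_resolve("nonstop.example.com"), can_connect("nonstop.example.com", 18650))
```

If resolution fails but connection by IP succeeds, the problem is DNS configuration on the agent; if resolution succeeds but the connection times out, check firewalls and listener configuration on the NonStop side.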

Querying Data from HP NonStop SQL/MX Database

This guide explains how to query data from an HP NonStop SQL/MX database, especially in the context of loading data into another analytics platform like Databricks. Examples use standard SQL/MX syntax and cover full data loads and incremental loads.

1. Example: SQL SELECT Statements in HP NonStop SQL/MX

Basic Select Statement

SELECT customer_id, customer_name, created_date
FROM SALES_DB.CUSTOMERS;

Select with WHERE Filter (for Incremental Load)

SELECT customer_id, customer_name, created_date
FROM SALES_DB.CUSTOMERS
WHERE created_date > '2024-06-01';

Type Conversion Example

When transferring data to Databricks (Apache Spark), be aware that SQL/MX data types may require conversion. For example:

HP NonStop SQL/MX Type | Databricks (Spark) Type
-----------------------|------------------------
CHAR/VARCHAR           | STRING
NUMERIC/DECIMAL        | DECIMAL
INTEGER                | INT
TIMESTAMP              | TIMESTAMP
DATE                   | DATE

SELECT order_id, 
       CAST(order_amount AS DECIMAL(18,2)) AS order_amount, 
       order_date
FROM SALES_DB.ORDERS;
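When generating target table DDL in Databricks, the mapping above can be applied programmatically. The helper below is a minimal sketch: it mirrors the table, preserves precision and scale for DECIMAL, and falls back to STRING for unlisted types; extend it for any additional types your schemas use.

```python
# Sketch: translate SQL/MX column type names to Spark SQL type names.
# Mirrors the mapping table above; unlisted types default to STRING.

SQLMX_TO_SPARK = {
    "CHAR": "STRING",
    "VARCHAR": "STRING",
    "NUMERIC": "DECIMAL",
    "DECIMAL": "DECIMAL",
    "INTEGER": "INT",
    "TIMESTAMP": "TIMESTAMP",
    "DATE": "DATE",
}


def to_spark_type(sqlmx_type: str) -> str:
    """Translate an SQL/MX type name (e.g. 'DECIMAL(18,2)') to Spark SQL."""
    base, sep, args = sqlmx_type.upper().partition("(")
    spark = SQLMX_TO_SPARK.get(base.strip(), "STRING")
    # Preserve precision/scale for DECIMAL, e.g. DECIMAL(18,2).
    if spark == "DECIMAL" and sep:
        return f"DECIMAL({args.rstrip(')')})"
    return spark


print(to_spark_type("DECIMAL(18,2)"))  # DECIMAL(18,2)
print(to_spark_type("VARCHAR(40)"))    # STRING
```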

2. Loading Pattern: Initial Load and Incremental Loads

The recommended approach for robust data ingestion is:

Initial (Full) Load

  • Fetch all records
  • No filter clause in your
    SELECT

SELECT *
FROM SALES_DB.ORDERS;
Matillion Database Query Component Settings:

  • Query: the SELECT statement above contains no predicate; all data is loaded.

Incremental Load

  • After the initial load, fetch only new or updated records since the last sync.
  • Use a filter clause with a "high water mark" field (e.g., last update timestamp or an auto-incrementing key).

SELECT *
FROM SALES_DB.ORDERS
WHERE order_date > :LAST_LOAD_DATE;
Matillion Database Query Component Settings:

  • Query: contains a WHERE clause using a parameter or variable to fetch only changed records.

Learn More: See detailed strategies for incremental load patterns in the Matillion Incremental Load Data Replication Strategy.

3. Using the Same Database Query Component

Both the initial and incremental loads should utilize the same Database Query component, differing only in how the WHERE clause is applied:

  • Initial Load: Omit the filter clause.
  • Incremental Load: Apply a filter using user-defined parameters (e.g., WHERE updated_at > ?).

This maximizes consistency in your ETL process and reduces maintenance efforts.
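The single-component pattern above can be sketched as a small query builder: one function produces both the full-load and incremental-load SELECT, switching on whether a high-water-mark value has been recorded. Table and column names here are illustrative.

```python
# Sketch: one query builder for both load modes of a Database Query component.
# A last_value of None means "initial full load"; otherwise an incremental
# filter on the high-water-mark column is appended.

from typing import Optional


def build_query(table: str, watermark_col: str, last_value: Optional[str]) -> str:
    """Return the SELECT for a full load (last_value=None) or incremental load."""
    query = f"SELECT * FROM {table}"
    if last_value is not None:
        # In a real pipeline, pass last_value as a bound parameter or
        # pipeline variable rather than interpolating it into the SQL text.
        query += f" WHERE {watermark_col} > TIMESTAMP '{last_value}'"
    return query


print(build_query("SALES_DB.ORDERS", "order_date", None))
print(build_query("SALES_DB.ORDERS", "order_date", "2024-06-01 00:00:00"))
```

After each successful incremental run, the pipeline stores the maximum watermark value seen, which becomes last_value on the next run.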

Data Integration Architecture

A core advantage of the ELT (Extract, Load, Transform) architecture lies in its "divide and conquer" approach: data is first loaded into Databricks before any transformation or integration begins. By separating the loading and transformation steps, ELT lets organizations stage raw data efficiently and address each challenge in a focused manner. Data integration inherently requires complex transformation operations, and the most robust and manageable way to achieve this is through data transformation pipelines. Databricks supports building these pipelines natively, which brings another significant benefit: transformations and integration are executed directly within Databricks itself. This design ensures that all integration tasks use the scalable compute resources of Databricks, offering high performance and on-demand processing without extra costs for external data engines or infrastructure.

Get started today

Matillion's comprehensive data pipeline platform offers more than point solutions.