Extracting data from HP NonStop SQL/MX to Snowflake
Extracting data from HP NonStop SQL/MX is an essential process for organizations seeking to leverage modern cloud analytics platforms such as Snowflake. HP NonStop SQL/MX, known for its reliability and scalability in mission-critical environments, often stores valuable operational data that can be further utilized through Snowflake’s advanced analytics capabilities. Successfully transferring data between these systems requires a clear understanding of both the source and target environments. In this article, we will guide you through each step necessary to establish a robust data pipeline from HP NonStop SQL/MX to Snowflake. We will begin by outlining the creation of a database identity in HP NonStop SQL/MX, ensuring secure and appropriate access. For users of Matillion ETL, we will address the requirements for checking or acquiring the necessary JDBC driver to facilitate connectivity. We will then examine the network connectivity needed from NonStop SQL/MX to Snowflake, highlighting key considerations to ensure a smooth data transfer. Finally, we will explore techniques for querying data, covering both initial extraction and strategies for incremental loading to keep your data in Snowflake up to date. Whether you are setting up your first NonStop SQL/MX to Snowflake integration or seeking to optimize an existing pipeline, the following sections will provide clear, practical guidance to help you achieve your data integration goals.
What is HP NonStop SQL/MX?
HP NonStop SQL/MX is a high-performance, relational database management system designed for the mission-critical environments of Hewlett Packard Enterprise’s NonStop servers. Leveraging the NonStop architecture's inherent fault tolerance and scalability, SQL/MX supports ANSI-compliant SQL and extends robust transaction management, parallel query processing, and distributed database capabilities. Its unique architecture ensures continuous availability and data integrity, making it ideal for industries such as financial services, telecommunications, and retail, where downtime is unacceptable. Additionally, SQL/MX offers seamless integration with modern analytics frameworks and supports mixed workloads, providing flexibility for organizations migrating from legacy databases or consolidating heterogeneous data assets.
What is Snowflake?
Snowflake is a cloud-native data warehousing platform with a multi-cluster shared data architecture, enabling high performance, concurrent processing, and elastic scalability. Unlike traditional databases, it separates storage and compute, letting organizations scale resources independently. Snowflake supports both structured and semi-structured data (JSON, Avro, Parquet) natively, and offers strong data sharing, security, and governance features. It integrates seamlessly with analytics and BI tools using standard SQL, making it an attractive choice for enterprises migrating to the cloud or modernizing data infrastructure without managing hardware.
Why Move Data from HP NonStop SQL/MX into Snowflake
Unlocking Advanced Analytics: The Benefits of Copying Data from HP NonStop SQL/MX to Snowflake
A data engineer or architect may find it advantageous to copy data from HP NonStop SQL/MX into Snowflake for several compelling reasons. HP NonStop SQL/MX often houses mission-critical transactional data that is potentially valuable for business intelligence, analytics, or operational reporting. However, the real value of this data is fully realized when it is integrated with data from other sources, enabling richer analytics and more comprehensive insights. By leveraging Snowflake as the integration platform, organizations can perform complex transformations, aggregations, and analyses without imposing additional workload or latency on the production HP NonStop system. This approach ensures that critical operational systems remain performant and available, while still harnessing the full value of the data for downstream applications and stakeholders.
Creating an Identity in HP NonStop SQL/MX
Creating a user (also known as an identity) in HP NonStop SQL/MX involves using the `CREATE USER` SQL statement. Below are the steps and example SQL scripts to create a new user in the database. Please ensure you have the necessary administrative privileges.
Prerequisites
- You must be connected to the target SQL/MX catalog (database).
- You must have the `MANAGE AUTHORIZATION` privilege or equivalent.
Step 1: Connect to the SQL/MX Database
Establish your SQL/MX client session using a tool such as `sqlci`:

```
sqlci -u <admin_user>,<admin_password>
```
Step 2: Create the User
You can create both external and internal users, depending on your authentication and authorization requirements.
Example 1: Creating an Internal User
This creates a user whose authentication is managed by SQL/MX.
```
CREATE USER "PROD_DEVELOPER" IDENTIFIED BY "StrongPassword123";
```
- `"PROD_DEVELOPER"`: the new user’s name (subject to your site's naming convention).
- `"StrongPassword123"`: the password for the new user.
Example 2: Creating an External User (Mapped to OS-level User/Group)
This maps an external identity (e.g., Guardian user ID or LDAP user) to a SQL/MX user.
```
CREATE USER "EXTERNAL_USER" FROM EXTERNAL "EXTERNAL_UID";
```
- `"EXTERNAL_USER"`: the SQL/MX user name (chosen for SQL/MX purposes).
- `"EXTERNAL_UID"`: the external operating system user ID, group, or fully distinguished LDAP name, depending on your NonStop system’s authentication configuration.
Note: External user support and required syntax depend on your site’s security configuration. Check your system documentation for the exact format.
Step 3: Grant Privileges
After creating a user, grant the necessary privileges so the user can access or modify database objects.
GRANT SELECT, INSERT ON "MYSCHEMA"."MYTABLE" TO "PROD_DEVELOPER";
- Adjust schema, table, and privileges as needed.
Step 4: Verify User Creation
You can confirm the user exists with:
SHOW USERS;
or query the system metadata tables:
SELECT * FROM INFORMATION_SCHEMA.USERS WHERE USER_NAME = 'PROD_DEVELOPER';
Note:
- User account policies (such as password complexity and expiration) depend on site-specific security settings.
- The syntax and features described above apply to HP NonStop SQL/MX Release 3.3x and later. For earlier versions, or for advanced security environments, consult your system administrator or product documentation.
Installing the JDBC Driver
To connect Matillion Data Productivity Cloud to the HP NonStop SQL/MX database, you must manually acquire and install the appropriate JDBC driver, as it is not bundled with Matillion by default. This exclusion is due to licensing or redistribution restrictions specific to HP NonStop SQL/MX JDBC drivers.
Step 1: Download the JDBC Driver
- Navigate to the Hewlett Packard Enterprise (HPE) support portal: https://support.hpe.com/.
- Search for the HP NonStop SQL/MX JDBC driver. Ensure that you select the Type 4 JDBC driver, as this is the preferred choice for cloud and remote integrations due to its platform independence and direct connectivity.
- Download the correct driver version compatible with both your HP NonStop SQL/MX server version and the Java runtime environment on your Matillion Agent. You may need an HPE account or specific product credentials to access the download.
Step 2: Prepare the JDBC Driver for Upload
- Locate the downloaded JAR file(s) for the JDBC driver. If compressed (e.g., ZIP or TAR), extract it to access the `.jar` file.
- Review the HPE documentation accompanying the driver for any prerequisites, such as licensing or additional files required.
Step 3: Install the JDBC Driver into Matillion Data Productivity Cloud
- Refer to Matillion's official documentation on uploading external JDBC drivers: https://docs.matillion.com/data-productivity-cloud/agent/docs/uploading-external-drivers/.
- Follow the outlined process, which typically includes:
- Uploading the driver’s JAR file through the Matillion Agent interface.
- Assigning the driver to the correct scope (e.g., agent- or project-level).
- Restarting or refreshing the agent if prompted to ensure the driver is recognized.
Step 4: Use the Driver in Matillion Pipelines
Once the JDBC driver has been successfully uploaded, you can configure your data connection within Matillion Data Productivity Cloud to access HP NonStop SQL/MX resources.
- For detailed instructions on configuring and using a new database source, consult Matillion's documentation: https://docs.matillion.com/data-productivity-cloud/designer/docs/database-query/.
- Provide the appropriate JDBC connection URL, authentication credentials, and other connection properties based on the requirements of your setup and the driver documentation.
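As a rough illustration of what such a connection looks like outside Matillion, the sketch below uses the `jaydebeapi` Python library to load the downloaded JAR. Note that the driver class name (`com.tandem.t4jdbc.SQLMXDriver`) and the `jdbc:t4sqlmx://` URL format shown here are assumptions for illustration only — verify both against the documentation that ships with your HPE driver download.

```python
# Sketch: connecting to SQL/MX through the Type 4 JDBC driver from Python.
# The driver class name and URL format are ASSUMPTIONS -- confirm both
# against the documentation bundled with your HPE driver.

def build_sqlmx_jdbc_url(host, port, catalog, schema):
    """Assemble a JDBC URL in the form the Type 4 SQL/MX driver is assumed to use."""
    return f"jdbc:t4sqlmx://{host}:{port}/:catalog={catalog};schema={schema}"


def connect_sqlmx(host, port, catalog, schema, user, password, jar_path):
    # Deferred import so the URL builder works without the library installed.
    import jaydebeapi  # pip install jaydebeapi
    return jaydebeapi.connect(
        "com.tandem.t4jdbc.SQLMXDriver",  # assumed driver class -- verify
        build_sqlmx_jdbc_url(host, port, catalog, schema),
        [user, password],
        jar_path,  # path to the downloaded .jar file
    )
```

The same host, port, catalog, schema, and credential values would go into the corresponding fields of the Matillion connection configuration.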
Checking network connectivity
To enable successful connectivity between Matillion Data Productivity Cloud and your HP NonStop SQL/MX database, you must ensure that the database accepts incoming connections based on your Matillion deployment configuration:
- Full SaaS agent configuration: Allow incoming connections from the Matillion Data Productivity Cloud platform by enabling access from the IP addresses listed at https://docs.matillion.com/data-productivity-cloud/security/docs/allowing-ip-addresses/. Make sure all relevant addresses are added to your firewall, database access control lists, or network security groups as appropriate.
- Hybrid SaaS deployment: Allow incoming connections from your own virtual private cloud (VPC) where your Matillion pipeline agent is running. To check connectivity and diagnose network access, you can use the utilities provided at https://exchange.matillion.com/data-productivity-cloud/pipeline/check-network-access/
Important:
If you are connecting to your HP NonStop SQL/MX database using a DNS hostname, ensure that the Full SaaS or Hybrid SaaS agent can successfully resolve the DNS address to the correct IP. Inaccessible or unresolvable DNS records will prevent successful connections.
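For a quick check of both conditions from the machine running your agent, a minimal Python sketch such as the following can verify DNS resolution and TCP reachability. The hostname and port are placeholders; substitute your SQL/MX host and its configured listener port.

```python
# Minimal reachability check: does this machine resolve the database hostname,
# and can it open a TCP connection to the database port? Hostname and port
# below are placeholders for your own environment.
import socket

def resolve_host(hostname):
    """Return the IP this machine resolves the hostname to, or None on failure."""
    try:
        return socket.gethostbyname(hostname)
    except socket.gaierror:
        return None

def can_reach(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Example usage: `resolve_host("sqlmx.internal.example.com")` followed by `can_reach(<resolved_ip>, <listener_port>)`. A `None` result or `False` points to DNS or firewall configuration, respectively.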
Querying Data from HP NonStop SQL/MX Database
This guide provides technical instructions and examples for querying data from an HP NonStop SQL/MX database, including best practices for supporting ETL operations such as initial and incremental data loads, and information on datatype conversion with Snowflake.
Example SELECT Statements: HP NonStop SQL/MX
Below are example SQL statements you can use with HP NonStop SQL/MX. The syntax is standard ANSI SQL with certain NonStop-specific characteristics for identifiers and schemas.
```
-- Basic SELECT: retrieve all columns from a table
SELECT * FROM my_schema.my_table;

-- Select specific columns with a WHERE filter
SELECT customer_id, customer_name, created_date
FROM retaildb.customers
WHERE country_code = 'US';

-- Aggregate query with GROUP BY and HAVING
SELECT region, COUNT(*) AS region_count
FROM auditdb.events
GROUP BY region
HAVING COUNT(*) > 100;
```
Notes on SQL/MX
- Identifiers (table/column names): by default, names are stored in uppercase unless quoted; e.g., `"MyTable"` preserves case.
- Date and time arithmetic and functions generally follow the ANSI SQL standard, but consult HP NonStop SQL/MX documentation for exact details.
Datatype Conversion: HP NonStop SQL/MX and Snowflake
When moving data from HP NonStop SQL/MX to Snowflake, datatype conversion may be necessary, since the two systems have some differences in their datatype implementations. For example:
| HP NonStop SQL/MX | Snowflake Equivalent |
|---|---|
| CHAR(n)/VARCHAR(n) | VARCHAR(n) |
| NUMERIC(p, s)/DECIMAL | NUMBER(p, s) |
| INTEGER | NUMBER |
| TIMESTAMP | TIMESTAMP_NTZ |
| DATE | DATE |
| INTERVAL | Supported, but syntax and precision may differ |
- Ensure explicit casting in your SELECT queries if the native SQL/MX types do not map cleanly to Snowflake.
- For unusual or legacy datatypes, consider using `CAST()` in the query.
```
SELECT customer_id,
       CAST(amount AS NUMERIC(18,2)),
       CAST(order_date AS CHAR(10))
FROM orders;
```
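If you generate target DDL programmatically, the mapping table above can be expressed as a simple lookup. The sketch below covers only the types listed in the table; extend it for the types your own schemas use.

```python
# SQL/MX -> Snowflake base-type mapping from the table above, usable when
# generating target DDL. Covers only the listed types; extend as needed.
SQLMX_TO_SNOWFLAKE = {
    "CHAR": "VARCHAR",
    "VARCHAR": "VARCHAR",
    "NUMERIC": "NUMBER",
    "DECIMAL": "NUMBER",
    "INTEGER": "NUMBER",
    "TIMESTAMP": "TIMESTAMP_NTZ",
    "DATE": "DATE",
}

def snowflake_type(sqlmx_type):
    """Map a SQL/MX type (e.g. 'NUMERIC(18,2)') to its Snowflake equivalent,
    carrying any precision/scale suffix across unchanged."""
    base, sep, args = sqlmx_type.upper().partition("(")
    target = SQLMX_TO_SNOWFLAKE.get(base.strip())
    if target is None:
        raise ValueError(f"no known mapping for {sqlmx_type!r}")
    return target + (sep + args if sep else "")
```

For example, `snowflake_type("NUMERIC(18,2)")` yields `"NUMBER(18,2)"`. Types outside the table (such as `INTERVAL`) raise an error, forcing an explicit decision rather than a silent guess.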
Initial vs. Incremental Data Loads
A common ETL pattern is an initial, one-off full load followed by periodic incremental loads. You can use the Database Query component (for example, in Matillion ETL) for both scenarios. See details at Matillion Exchange: Incremental Load Strategy.
Initial Load
- Pattern: No filter clause, extracts all rows.
- Use: For populating a new table or generating a baseline snapshot.
SELECT * FROM transactions;
Incremental Load
- Pattern: Includes a filter clause that typically refers to a change-tracking column (such as update or insert timestamp, or an incrementing key).
- Use: For extracting only rows modified since the last load.
```
SELECT *
FROM transactions
WHERE modified_at > CAST('2024-06-01 00:00:00' AS TIMESTAMP);
```
Or if using an incrementing numeric ID:
SELECT * FROM transactions WHERE transaction_id > 123456;
- The same Database Query component and SQL interface can be reused by parameterizing the filter value (e.g., via variables set by your ETL/ELT tool).
- For more guidance, follow the link above and your organization’s CDC (Change Data Capture) strategy.
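The parameterized-filter pattern can be sketched as a small query builder: the watermark recorded after the previous run is substituted into the filter clause of the next extract. Table and column names are illustrative, and a numeric key is used for simplicity (a timestamp watermark would need quoting or the `CAST` form shown above); in Matillion the watermark would typically live in a job variable rather than Python code.

```python
# Sketch of initial vs. incremental extraction: with no watermark the full
# table is extracted; with one, only rows past the recorded high-water mark.
# Table and column names are illustrative.

def build_extract_query(table, watermark_column=None, last_watermark=None):
    """Return a full-load query, or an incremental one when a watermark is given."""
    query = f"SELECT * FROM {table}"
    if watermark_column is not None and last_watermark is not None:
        query += f" WHERE {watermark_column} > {last_watermark}"
    return query
```

Example usage: `build_extract_query("transactions")` for the initial load, then `build_extract_query("transactions", "transaction_id", 123456)` on each subsequent run, updating the stored watermark after every successful extract.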
For more detail, see Matillion Knowledge Base: Database Query Component and HP NonStop SQL/MX documentation.
Data Integration Architecture
Loading data in advance of integration is a practical way to divide and conquer data management challenges, as it splits the process into two distinct steps: first, loading raw data into the Snowflake database, and second, performing the necessary transformations and integration tasks. This staged approach is a central advantage of the ELT (Extract, Load, Transform) architecture. With ELT, data transformation—a key requirement for successful integration—is most effectively achieved through data transformation pipelines, which allow for complex, repeatable, and automated processing of source data. Furthermore, a significant benefit of the ELT model is that all transformation and integration occurs within the target Snowflake environment. This leverages Snowflake’s scalable compute capabilities, enabling transformations to be executed quickly, on-demand, and as resources are required, without the need for separate or costly external data processing infrastructure.
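The two-stage flow described above can be sketched as a pair of Snowflake SQL statements run in order: first land raw rows, then transform entirely inside Snowflake. All table, stage, and column names below are hypothetical, and the statements are a simplified illustration of the pattern rather than production DDL.

```python
# Sketch of the two ELT stages as Snowflake SQL executed in sequence.
# Stage 1 lands raw extracted data; stage 2 transforms it inside Snowflake.
# Object names and file-format options are hypothetical placeholders.
ELT_STATEMENTS = [
    # Stage 1 -- load: copy raw extracted files into a landing table.
    """COPY INTO raw.transactions
       FROM @landing_stage/transactions/
       FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1)""",
    # Stage 2 -- transform: build an integrated model from the landing table,
    # using Snowflake compute rather than external processing infrastructure.
    """CREATE OR REPLACE TABLE analytics.daily_transaction_totals AS
       SELECT CAST(modified_at AS DATE) AS txn_date, SUM(amount) AS total_amount
       FROM raw.transactions
       GROUP BY CAST(modified_at AS DATE)""",
]

def run_elt(cursor):
    """Execute the load step, then the transform step, on a DB-API cursor."""
    for stmt in ELT_STATEMENTS:
        cursor.execute(stmt)
```

Because both statements run inside Snowflake, the transform stage can be re-run or revised without re-extracting anything from the NonStop source.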