
Integrate data from Progress OpenEdge RDBMS to Snowflake using Matillion

Our Progress OpenEdge RDBMS to Snowflake connector enables seamless and up-to-date data transfer to Snowflake within minutes, eliminating the need for manual coding or complex ETL processes.


Extracting data from Progress OpenEdge RDBMS to Snowflake

Extracting data from Progress OpenEdge RDBMS is a critical step for organizations seeking to leverage their operational data within modern cloud data platforms like Snowflake. Achieving a seamless integration requires careful planning, a clear understanding of both systems, and attention to details such as user access, connectivity, and data consistency. In this article, we will guide you through the process of extracting data from Progress OpenEdge RDBMS and loading it into Snowflake efficiently. Our coverage will include establishing the necessary user identity in the Progress OpenEdge database, ensuring you have the correct JDBC driver (particularly for users of Matillion ETL), verifying and configuring the appropriate network connectivity between your source and target environments, and finally, strategies for querying data both for the initial full extraction and for ongoing incremental updates. Whether you're moving a few tables or architecting a regular data pipeline, this article aims to provide the essential steps and considerations for a successful data transfer.


What is Progress OpenEdge RDBMS?

The Progress OpenEdge RDBMS is a high-performance, enterprise-grade relational database designed for scalability, reliability, and seamless integration with transactional business applications. It supports multi-tenant architectures, advanced data security, and robust transaction control in high-concurrency environments. Closely tied to the Progress OpenEdge ABL, it enables streamlined application development. Key features include online schema changes, replication, flexible backup and recovery, and efficient indexing, making it ideal for on-premises or cloud deployments. Industries choose OpenEdge RDBMS for mission-critical use due to its stability, 24/7 operational support, scalability, and low administrative overhead.


What is Snowflake?

Snowflake is a cloud-native data platform that solves performance, scalability, and management issues in traditional databases for big data and analytics. Its architecture separates compute and storage for independent scaling and efficient resource use. With a multi-cluster shared data design, multiple users can access the same data simultaneously without conflicts. Supporting both structured and semi-structured data, Snowflake allows easy ingestion and querying using standard SQL. As a platform-as-a-service (PaaS), it automates replication, tuning, security, and infrastructure management, providing organizations with a flexible, resilient solution for data warehousing, lakes, and sharing across public clouds.

Why Move Data from Progress OpenEdge RDBMS into Snowflake

Unlocking Analytics: The Case for Replicating Progress OpenEdge Data into Snowflake

Data engineers and architects may wish to copy data from a Progress OpenEdge RDBMS into Snowflake for several compelling reasons. First, the Progress OpenEdge RDBMS often stores transactional and operational data that can provide significant value when analyzed; however, this value is frequently locked within the system unless it is integrated with data from other sources. By consolidating data from Progress OpenEdge RDBMS alongside information from additional source systems in Snowflake, organizations can uncover deeper business insights, perform comprehensive analytics, and support decision-making processes more effectively. Additionally, carrying out data integration and processing workloads directly in Snowflake, rather than on the production Progress OpenEdge RDBMS, helps avoid unnecessary strain on the operational database, thereby preserving its performance and stability for day-to-day business operations. This architecture ensures that analytical workloads do not compete with transactional processing, leading to improved reliability and scalability.

Creating a User in Progress OpenEdge RDBMS

Progress OpenEdge uses an authentication model that relies on operating system users or application-level authentication, rather than managing users directly within the database system as in some other RDBMS platforms. However, you can control access at the database level by creating database users (aliases for user accounts) and managing their permissions using SQL.

The instructions below illustrate how to create a user identity and configure access permissions in an OpenEdge environment, primarily via the SQL interface.

1. Add a User to the Database

In OpenEdge, user IDs (usernames) and, optionally, passwords are registered with the database security system using the `GRANT` statement.

  1. Open a SQL command-line tool, such as `sqlexp`, or use a SQL client capable of connecting to OpenEdge (e.g., DBeaver, SQuirreL).
  2. Connect to your OpenEdge database as a user with DBA privileges.

  3. Register a new user and set permissions

OpenEdge doesn't have a traditional CREATE USER statement; instead, you define user identities (matching OS or application users) with permissions:

```
-- Grant CONNECT privilege to a specified user ID:
GRANT CONNECT TO '<user_id>';
```

For example:

```
GRANT CONNECT TO 'jdoe';
```

This command allows the user `jdoe` to connect to the database via SQL access.

  4. Grant Additional Privileges to the User

You can grant further privileges on schemas, tables, or other objects, for example:

```
-- Grant SELECT privilege on a table to the new user:
GRANT SELECT ON pub.my_table TO 'jdoe';

-- Grant all privileges on a table to the user:
GRANT ALL PRIVILEGES ON pub.my_table TO 'jdoe';
```

2. Set Up Authentication

OpenEdge SQL Authentication typically validates user IDs against:

  • The operating system (if using OS authentication).
  • The list of users stored in the authorization table via defined grants.
  • Application logic for further authentication if you build it within your ABL application or security layer.

Note: You do not set a password with SQL. Identify the user by their user ID (the `_User` table) and manage any authentication requirements outside of SQL.

3. Verify the User

You can query the database authorization table for existing users:

```
-- List all users allowed to connect:
SELECT * FROM sysprogress.SYSDBAUTH;
```

4. Revoking Access

If you need to remove a user's access:

```
-- Revoke CONNECT privilege:
REVOKE CONNECT FROM 'jdoe';
```
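When provisioning several ETL users or many tables, the grant and revoke statements above are easier to manage as a script. A minimal sketch in Python (the helper name is illustrative, not part of any OpenEdge tooling; it only builds the SQL text, which you would then run through your SQL client or connection):

```python
def openedge_user_grants(user_id, tables, privileges=("SELECT",)):
    """Build the OpenEdge SQL statements that register a user ID and
    grant it privileges on the given tables (e.g. 'pub.my_table')."""
    statements = [f"GRANT CONNECT TO '{user_id}';"]
    for table in tables:
        for privilege in privileges:
            statements.append(f"GRANT {privilege} ON {table} TO '{user_id}';")
    return statements

# Example: a read-only extraction user for two tables in the pub schema
for stmt in openedge_user_grants("jdoe", ["pub.customer", "pub.orders"]):
    print(stmt)
```

A read-only privilege set like this is usually sufficient for an extraction user, and it keeps the ETL identity from modifying operational data.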


References:
- Progress OpenEdge Database Management: SQL Reference
- OpenEdge Security and Authentication Guide

Installing the JDBC driver

At the time of writing, the JDBC driver for the Progress OpenEdge RDBMS is not included by default in Matillion Data Productivity Cloud. This absence is due to licensing or redistribution restrictions, which require end-users to download and manage the JDBC driver installation manually.

To install and use the Progress OpenEdge JDBC driver with Matillion Data Productivity Cloud, please follow these instructions:

1. Download the JDBC Driver

The Progress OpenEdge JDBC driver can be downloaded directly from Progress Software’s official website. Go to https://www.progress.com/openedge/downloads and locate the JDBC driver suitable for your platform. Ensure that you select the Type 4 JDBC driver, which is a pure Java implementation recommended for most integrations due to its platform independence and performance.

Note: Access to downloads may require you to create an account or log in on the Progress site.

2. Install the JDBC Driver into Matillion

After successfully downloading the JDBC driver (usually distributed as a `.jar` file), you'll need to upload and register the driver with your Matillion Data Productivity Cloud instance. This process enables Matillion to interface with Progress OpenEdge databases via JDBC.

Detailed, step-by-step instructions for uploading external JDBC drivers are provided in the Matillion documentation: https://docs.matillion.com/data-productivity-cloud/agent/docs/uploading-external-drivers/

Follow the steps in the linked documentation to upload the downloaded Progress OpenEdge JDBC `.jar` file to your Matillion environment.

3. Configure and Use the Driver

Once the driver is installed, you can configure your database connections and begin using Progress OpenEdge data sources in Matillion. For instructions on how to set up and use database queries within the Matillion Data Productivity Cloud environment, consult the official usage guide: https://docs.matillion.com/data-productivity-cloud/designer/docs/database-query/

This documentation provides information on configuring deployment environments, specifying connection properties, and utilizing JDBC data sources for integrations and queries within the Matillion platform.
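The connection itself is defined by a JDBC URL. As a sketch, the URL format below is the one typically used by the Progress DataDirect OpenEdge driver; the host, port, and database name are placeholders, and you should verify the exact format and driver class against the documentation shipped with your driver version:

```python
def openedge_jdbc_url(host, port, database):
    """Compose a JDBC URL in the form typically used by the Progress
    DataDirect OpenEdge Type 4 driver. Verify the exact syntax against
    your driver's documentation before use."""
    return f"jdbc:datadirect:openedge://{host}:{port};databaseName={database}"

# The driver class is commonly com.ddtek.jdbc.openedge.OpenEdgeDriver
# (check your driver's README); host/port/database below are examples.
url = openedge_jdbc_url("oe-prod.example.com", 20931, "sports2020")
print(url)
```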

Checking network connectivity

To ensure a successful connection between Matillion Data Productivity Cloud and your Progress OpenEdge RDBMS database, you must verify that your database is configured to allow incoming network connections based on your deployment type:

  • Full SaaS agent configuration:
    Make sure your Progress OpenEdge RDBMS database allows incoming connections from the IP addresses used by Matillion's cloud services. The current list of required IP addresses is maintained at this page in the Matillion documentation. Please review and update your firewall or network allowlist as necessary.

  • Hybrid SaaS deployment:
    In this setup, incoming database connections are established from within your own virtual private cloud (VPC). You must configure your database to accept connections from the relevant internal network locations within your VPC. If you need to check or verify accessible network paths, utilities are available at Matillion Exchange to help diagnose connectivity.

Additionally, if your Progress OpenEdge RDBMS database is referenced using a DNS hostname, the Full SaaS or Hybrid SaaS agent must be able to resolve this hostname to the correct IP address. Ensure that all required DNS records are correctly set up and accessible to the relevant agents.
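Both requirements, DNS resolution and an open network path, can be smoke-tested from wherever the agent runs. A minimal sketch (the hostname and port are placeholders for your own database endpoint):

```python
import socket

def check_database_reachable(host, port, timeout=5):
    """Return (resolved_ip, reachable): first resolve the DNS hostname,
    then attempt a TCP connection to the database port."""
    try:
        ip = socket.gethostbyname(host)          # DNS resolution step
    except socket.gaierror:
        return None, False                       # hostname does not resolve
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return ip, True                      # TCP handshake succeeded
    except OSError:
        return ip, False                         # resolved, but port unreachable

# Example with placeholder endpoint values:
ip, reachable = check_database_reachable("localhost", 5432, timeout=2)
print(f"resolved={ip} reachable={reachable}")
```

If the hostname resolves but the port is unreachable, the problem is usually a firewall or allowlist rule rather than DNS.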

Querying Data from a Progress OpenEdge RDBMS Database

This guide explains how to retrieve data from a Progress OpenEdge RDBMS database using SQL queries, handle potential datatype conversions to Snowflake, and design efficient data loading strategies using the Database Query component.


Example SQL SELECT Statements

While Progress OpenEdge databases can be queried using their own 4GL/ABL, SQL access via ODBC or JDBC is also commonly used, especially for integrations and ETL processes. Below are example SELECT statements:

```
-- Retrieve all columns from the "customer" table
SELECT * FROM customer;

-- Retrieve specific fields: Customer ID, Name, and City
SELECT custnum, name, city FROM customer;

-- Retrieve orders placed after a certain date
SELECT ordernum, orderdate, amount FROM orders WHERE orderdate > '2023-01-01';

-- Aggregate sales by customer
SELECT custnum, SUM(amount) AS total_sales FROM orders GROUP BY custnum ORDER BY total_sales DESC;
```

Note: Table and column names in Progress OpenEdge may be case-sensitive depending on database settings, and some identifiers require double quotes. Refer to your schema when constructing queries.


Datatype Conversion Considerations

When transferring data from Progress OpenEdge to Snowflake, be aware of datatype mapping discrepancies. Some common Progress OpenEdge to Snowflake conversions include:

| Progress OpenEdge | Typical Snowflake Mapping |
|---|---|
| INTEGER | NUMBER |
| CHARACTER | VARCHAR |
| DATE | DATE |
| DECIMAL / NUMERIC | NUMBER or DECIMAL |
| LOGICAL | BOOLEAN |
| DATETIME | TIMESTAMP_NTZ |
  • Best Practice: Review and adjust column definitions in Snowflake to accommodate source Progress OpenEdge datatypes, especially for CHARACTER fields (length/Unicode) and DECIMAL precision.
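When many tables are involved, encoding the mapping in code makes the target DDL reproducible. A minimal sketch of the mapping table above (the function and the precision defaults are illustrative; extend the dictionary to cover the types present in your schema):

```python
# Illustrative OpenEdge -> Snowflake type map, mirroring the table above.
OPENEDGE_TO_SNOWFLAKE = {
    "INTEGER": "NUMBER",
    "CHARACTER": "VARCHAR",
    "DATE": "DATE",
    "DECIMAL": "NUMBER",
    "NUMERIC": "NUMBER",
    "LOGICAL": "BOOLEAN",
    "DATETIME": "TIMESTAMP_NTZ",
}

def snowflake_column(name, oe_type, length=None, scale=None):
    """Translate one OpenEdge column definition into a Snowflake DDL
    fragment, carrying CHARACTER length and DECIMAL precision across."""
    key = oe_type.upper()
    sf_type = OPENEDGE_TO_SNOWFLAKE[key]
    if key == "CHARACTER" and length:
        sf_type = f"VARCHAR({length})"
    elif key in ("DECIMAL", "NUMERIC") and length:
        sf_type = f"NUMBER({length},{scale or 0})"
    return f"{name} {sf_type}"

print(snowflake_column("name", "CHARACTER", length=60))
print(snowflake_column("amount", "DECIMAL", length=12, scale=2))
print(snowflake_column("active", "LOGICAL"))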

Incremental Load Strategy with the Database Query Component

The recommended loading pattern is as follows:

  1. Initial Load (Once-Off):

    • Pull all necessary data from the source table.
    • Use the Database Query component without a filter clause.
    • Example:
      SELECT * FROM orders;
    • All records will be transferred for the first synchronization.
  2. Incremental Load (Ongoing):

    • After the initial load, retrieve only newly created or updated records.
    • Use the Database Query component with a filter clause to select only changed data since the last load.
    • Example: Incremental by `orderdate` or `updated_at` timestamp:
      SELECT * FROM orders
        WHERE orderdate > ?
    • The filter value (`?`) is typically set dynamically using a variable populated with the timestamp or key value from the last successful load.
  3. Read more about this approach: Incremental Load & Data Replication Strategy on Matillion Exchange

Tip: Using the same Database Query component configuration for both load types simplifies maintenance and enables consistent data extraction processes. The only change between full and incremental loads is the WHERE filter condition.
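The full-versus-incremental pattern above can be sketched end to end. This is a minimal illustration only: SQLite stands in for OpenEdge so the example is runnable, and the `orders` schema matches the earlier SELECT examples; against a real source the same queries would run through the JDBC connection, with the watermark persisted between pipeline runs:

```python
import sqlite3

# SQLite is used here purely as a runnable stand-in for the source database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (ordernum INTEGER, orderdate TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "2023-03-01", 100.0), (2, "2023-06-15", 250.0), (3, "2023-09-30", 75.0)],
)

def extract(conn, watermark=None):
    """Full load when watermark is None; otherwise only rows newer than it."""
    if watermark is None:
        return conn.execute("SELECT * FROM orders").fetchall()
    return conn.execute(
        "SELECT * FROM orders WHERE orderdate > ?", (watermark,)
    ).fetchall()

initial = extract(conn)                     # initial load: all rows
watermark = max(row[1] for row in initial)  # remember the high-water mark

# New data arrives between runs...
conn.execute("INSERT INTO orders VALUES (4, '2023-12-01', 300.0)")

incremental = extract(conn, watermark)      # next run: only the new row
print(f"initial={len(initial)} rows, incremental={len(incremental)} rows")
```

Persisting the watermark after each successful run (for example in a pipeline variable or a control table) is what makes the incremental query repeatable and safe to rerun.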


By following the above practices and adjusting for datatype mapping, you can reliably migrate and synchronize data from Progress OpenEdge RDBMS to your analytics destination such as Snowflake.

Data Integration Architecture

One of the main benefits of the ELT (Extract, Load, Transform) architecture is its ability to divide and conquer the data integration process by separating it into two distinct steps: first, loading the raw data into the Snowflake database, and then performing any necessary data transformation and integration as a second step within the database itself. This separation allows organizations to leverage highly efficient data transformation pipelines, which are specifically designed for restructuring, cleaning, and enriching data during the integration process. By executing the transformation and integration natively inside the target Snowflake environment, organizations benefit from Snowflake's high performance, scalability, and on-demand processing capabilities. This approach also eliminates the need for external data processing infrastructure, leading to reduced operational complexity and cost, while fully utilizing the power and flexibility of Snowflake's computing resources.

Get started today

Matillion's comprehensive data pipeline platform offers more than point solutions.