Integrate data from Raima Database Manager (RDM) to Amazon Redshift using Matillion

The Raima Database Manager (RDM) to Redshift connector enables seamless and efficient data transfers to Redshift, ensuring your information remains current without the need for manual coding or intricate ETL processes.


Extracting data from Raima Database Manager (RDM) to Amazon Redshift

Extracting data from Raima Database Manager (RDM) is an essential step for organizations wishing to leverage the analytical power of Amazon Redshift. Whether your goal is centralized reporting, advanced analytics, or simply data modernization, a well-designed extraction and loading process can ensure seamless integration between these two platforms. This article will guide you through each major stage of the process. We will begin by explaining how to create an identity within RDM, ensuring proper authentication and access management. For those using Matillion, we will discuss how to check for—and, if necessary, acquire—the appropriate JDBC driver to facilitate connectivity. Next, we will review the critical aspects of network connectivity and security between your RDM source and Redshift target environments. Finally, we will cover the practicalities of querying and extracting data, both for an initial data load and for subsequent incremental updates. By following this guide, readers will gain a comprehensive understanding of the tools and processes required to efficiently extract data from Raima Database Manager and load it into Amazon Redshift.


What is Raima Database Manager (RDM)?

Raima Database Manager (RDM) is a high-performance, embedded database engine optimized for real-time, resource-constrained environments like industrial automation, telecommunications, and IoT. Supporting both relational and network models, RDM combines data integrity with fast, flexible navigation. Written in portable C, it runs on diverse OSs and hardware. Features include ACID transactions, multi-user access, a flexible API, and compact footprint, making it ideal for applications needing reliable, local data management without the bulk of traditional client-server databases. Cross-platform compatibility and minimal overhead further contribute to its popularity in systems demanding predictable performance and efficiency.


What is Amazon Redshift?

Amazon Redshift is a fully managed, petabyte-scale data warehouse service in the cloud, designed to enable fast and complex querying of large datasets using standard SQL. Built on top of PostgreSQL, Redshift optimizes analytical workloads through columnar storage, data compression, and massively parallel processing (MPP) architecture. It integrates seamlessly with AWS services, supports high availability through automatic replication, and offers features such as Redshift Spectrum for direct querying of data in Amazon S3. With automatic scaling, workload management, and a pay-as-you-go pricing model, Redshift facilitates efficient, scalable, and secure data warehousing for a wide range of enterprise use cases.

Why Move Data from Raima Database Manager (RDM) into Amazon Redshift

Unlocking Analytics: Copying Data from Raima Database Manager (RDM) to Redshift

There are several compelling reasons why a data engineer or architect might choose to copy data from Raima Database Manager (RDM) into Amazon Redshift. First, RDM often contains operational data that can be of significant value when analyzed or combined with data from other business systems. However, the true potential of this data is realized when it is integrated with information from diverse sources, such as customer management systems, analytics platforms, and external data feeds. By importing RDM data into Redshift—a scalable, cloud-based data warehouse—organizations can perform complex analyses and reporting tasks across unified datasets. Importantly, leveraging Redshift for these purposes helps to avoid imposing additional processing loads on the primary RDM environment, thereby preserving its performance for transactional or embedded applications while still unlocking the value of the data through powerful analytics and integration capabilities.

Creating a User in Raima Database Manager (RDM)

To manage access to your Raima Database Manager (RDM) database, you can create users with specific authentication credentials. User and authentication management is supported via SQL commands. Below are step-by-step instructions and an example SQL script for creating a new user identity in an RDM database.

Prerequisites

  • You must have a database created and running.
  • You need to be connected to RDM with a user account that has sufficient privileges (typically an administrative user).

Instructions

  1. Connect to the RDM SQL engine using the provided client (such as `sqlcli` or another interface compatible with RDM SQL).

  2. Create a user by executing the `CREATE USER` SQL statement. Specify the username and the corresponding password.

CREATE USER username IDENTIFIED BY 'password';

  • Replace `username` with the desired username (e.g., `appuser`).
  • Replace `'password'` with a secure password of your choice.
  • Grant privileges if the user needs access to database objects or specific actions, for example:

GRANT SELECT, INSERT ON mytable TO username;

  • Adjust the privileges (`SELECT`, `INSERT`, etc.) and the table name (`mytable`) as required.

Example

Assume you want to create a user called `appuser` with password `StrongP@ssw0rd`:

CREATE USER appuser IDENTIFIED BY 'StrongP@ssw0rd';

Optionally, to allow `appuser` to read and insert data on a table named `customer`:

GRANT SELECT, INSERT ON customer TO appuser;

Tips

  • Passwords are case sensitive and should follow your organization's security policies.
  • You may need to commit the transaction to apply changes, depending on your session settings:

COMMIT;

  • Review RDM documentation for details on available user management and privilege commands, as implementations may vary slightly between versions.
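As a convenience, the statements above can be generated programmatically, for example when provisioning many users from a script. The sketch below is a hypothetical helper, not part of RDM itself; the identifier pattern is a simple safeguard and the authoritative rules for valid RDM identifiers are in the RDM documentation.

```python
import re

IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9_]*")

def create_user_sql(username: str, password: str) -> str:
    """Build a CREATE USER statement like the one shown above."""
    if not IDENTIFIER.fullmatch(username):
        raise ValueError(f"invalid username: {username!r}")
    # Escape single quotes in the password per standard SQL rules.
    escaped = password.replace("'", "''")
    return f"CREATE USER {username} IDENTIFIED BY '{escaped}';"

def grant_sql(privileges, table: str, username: str) -> str:
    """Build a GRANT statement for the given privileges and table."""
    for ident in (table, username):
        if not IDENTIFIER.fullmatch(ident):
            raise ValueError(f"invalid identifier: {ident!r}")
    return f"GRANT {', '.join(privileges)} ON {table} TO {username};"

print(create_user_sql("appuser", "StrongP@ssw0rd"))
print(grant_sql(["SELECT", "INSERT"], "customer", "appuser"))
```

Run the emitted statements through `sqlcli` (or your preferred RDM SQL interface) as an administrative user.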

Installing the JDBC Driver

The Raima Database Manager (RDM) JDBC driver enables connectivity between the RDM database and third-party tools, such as Matillion Data Productivity Cloud. Please note that, at the time of writing, the RDM JDBC driver is not bundled with Matillion Data Productivity Cloud because of specific licensing or redistribution restrictions. Therefore, the driver must be downloaded and installed manually.

To successfully enable integration, follow the steps outlined below:

1. Download the JDBC Driver

  1. Visit the official Raima Database Manager product page at https://www.raima.com/products/rdm.
  2. On the downloads section of the page, locate the JDBC driver suitable for your platform.
  3. Preferably, select the Type 4 JDBC driver. Type 4 drivers are implemented in pure Java and are typically the most compatible option when integrating with cloud-based services such as Matillion Data Productivity Cloud.

2. Review the Licensing Terms

Before proceeding, ensure you have reviewed and complied with any licensing requirements associated with the RDM JDBC driver. Due to redistribution or licensing restrictions, distribution or automated download may not be permitted, which is why this manual step is required.

3. Install the JDBC Driver in Matillion Data Productivity Cloud

  1. Once you have obtained the driver JAR file, follow the official Matillion instructions for uploading external JDBC drivers:
  2. Refer to the guide at https://docs.matillion.com/data-productivity-cloud/agent/docs/uploading-external-drivers/.
  3. This process generally involves accessing your Matillion environment, navigating to the agent or runtime environment, and uploading the JDBC JAR file to the correct directory.

4. Configure and Use the Driver

  • After the JDBC driver is installed, you can configure connections to your RDM database via Matillion Data Productivity Cloud.
  • Detailed setup and usage instructions can be found in the official Matillion documentation: https://docs.matillion.com/data-productivity-cloud/designer/docs/database-query/.
  • This guide covers how to create data source configurations and leverage the JDBC driver in your workflows.

Checking network connectivity

To enable connectivity between Matillion Data Productivity Cloud and your Raima Database Manager (RDM) database, the RDM server must accept incoming TCP connections from the network where your Matillion agent runs, on the port your RDM server is configured to listen on. Check firewall and security group rules on both sides to confirm this traffic is allowed.

Additionally, if you are referencing the RDM database using a DNS hostname, the Full SaaS or Hybrid SaaS agent must be able to resolve this address. Ensure that the relevant DNS records are accessible from the network where your selected deployment is running.
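A quick way to verify both DNS resolution and port reachability from the agent's network is a small TCP probe. This is a generic sketch using only the Python standard library; the host and port are placeholders for your RDM server's address and configured listener port.

```python
import socket

def check_rdm_reachable(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds.

    DNS resolution is attempted first, so a name-resolution failure
    and a blocked/closed port produce distinct error messages.
    """
    try:
        addr = socket.gethostbyname(host)  # DNS resolution check
    except socket.gaierror as exc:
        print(f"DNS resolution failed for {host}: {exc}")
        return False
    try:
        with socket.create_connection((addr, port), timeout=timeout):
            return True
    except OSError as exc:
        print(f"TCP connection to {addr}:{port} failed: {exc}")
        return False
```

A `False` result with a DNS error points to missing DNS records in the agent's network; a `False` result with a connection error points to firewall or listener configuration.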

Querying Data from a Raima Database Manager (RDM) Database

This guide explains how to query data from a Raima Database Manager (RDM) database, with practical SQL examples, notes on datatype conversion (particularly in Redshift integrations), and effective strategies for both initial and incremental data load processes.


1. Example RDM SQL SELECT Statements

RDM supports a subset of the SQL standard for querying data. Here are a few examples:

```sql
-- Selecting all columns from a table called "customers"
SELECT * FROM customers;

-- Selecting specific columns
SELECT customer_id, first_name, last_name FROM customers;

-- Applying a WHERE filter
SELECT * FROM orders WHERE order_date >= '2024-01-01';

-- Using ORDER BY and LIMIT (if supported by your RDM version)
SELECT * FROM products ORDER BY price DESC LIMIT 10;
```

Note: RDM’s SQL dialect may not support all standard SQL functions. Refer to your RDM documentation for precise syntax and supported functions/operators.


2. Datatype Conversion: RDM to Redshift

When integrating RDM with Amazon Redshift or other analytics platforms, datatype conversion is required, as not all RDM datatypes have a direct counterpart in Redshift:

| RDM Datatype     | Recommended Redshift Datatype |
|------------------|-------------------------------|
| CHAR(n), VARCHAR | VARCHAR(n)                    |
| SMALLINT         | SMALLINT                      |
| INTEGER          | INTEGER                       |
| FLOAT, DOUBLE    | FLOAT8                        |
| DATE, TIME       | TIMESTAMP                     |
| BINARY, BLOB     | VARBYTE                       |

Tip: Always validate your ETL process to ensure correct casting and rounding, particularly for dates and floating-point types.
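When generating Redshift DDL from an RDM schema, the mapping above can be encoded directly. The following is a minimal sketch of that mapping, treating the table above as a starting point; validate it against your own schema and RDM version.

```python
# Mapping derived from the conversion table above.
RDM_TO_REDSHIFT = {
    "CHAR": "VARCHAR",
    "VARCHAR": "VARCHAR",
    "SMALLINT": "SMALLINT",
    "INTEGER": "INTEGER",
    "FLOAT": "FLOAT8",
    "DOUBLE": "FLOAT8",
    "DATE": "TIMESTAMP",
    "TIME": "TIMESTAMP",
    "BINARY": "VARBYTE",
    "BLOB": "VARBYTE",
}

def redshift_type(rdm_type: str) -> str:
    """Translate an RDM column type such as 'VARCHAR(50)' into its
    recommended Redshift equivalent, preserving any length suffix
    for character types."""
    base, sep, rest = rdm_type.upper().partition("(")
    target = RDM_TO_REDSHIFT[base.strip()]
    if target == "VARCHAR" and sep:
        return f"VARCHAR({rest}"  # keep the original length, e.g. VARCHAR(50)
    return target
```

For example, `redshift_type("CHAR(10)")` yields `VARCHAR(10)` and `redshift_type("DOUBLE")` yields `FLOAT8`.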


3. Loading Patterns: Initial and Incremental Loads

The recommended data integration pattern is:

  • Initial Load: Perform a once-off bulk import of all available data from RDM.
  • Incremental Loads: Regularly transfer only new or updated rows.

The same Database Query component can be reused for both, with a key difference: the use of a `WHERE` filter for incremental loads.

a. Initial Load Query

  • No additional filter is required; fetch all rows.

SELECT * FROM customers;

Or, for specific columns:

SELECT customer_id, first_name, last_name, last_modified FROM customers;

b. Incremental Load Query

  • Use a `WHERE` clause to fetch only rows added/changed since the last extraction, often based on a timestamp or auto-incrementing key.

SELECT * FROM customers WHERE last_modified > '2024-06-01 00:00:00';

This incremental loading strategy minimizes data movement and improves performance.

Read more: Incremental-Load Data Replication Strategy (Matillion Exchange)


4. Using the Database Query Component

  • Initial Load: Leave the filter blank in the Database Query component.
  • Incremental Load: Configure the component with an appropriate filter clause, usually referencing a stored "last loaded" value.

Pseudo-configuration example:

```
query: "SELECT * FROM customers WHERE last_modified > ?"
```

Replace the question mark with a parameter that is set to your last successful load's timestamp.

Replace the question mark with a parameter that is set to your last successful load's timestamp.
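The initial/incremental pattern can be sketched as a single query builder keyed off a stored watermark. This is an illustrative helper, not a Matillion API: in a real pipeline the `last_loaded` value would come from persistent storage such as a pipeline variable or a control table, and `table` and `watermark_column` are placeholders for your own names.

```python
from datetime import datetime
from typing import Optional

def incremental_query(table: str, watermark_column: str,
                      last_loaded: Optional[datetime]) -> str:
    """Build the extraction query for the Database Query component.

    With no stored watermark (first run) this returns the initial-load
    query; on later runs it appends the WHERE filter so only rows
    changed since the last successful load are fetched.
    """
    base = f"SELECT * FROM {table}"
    if last_loaded is None:
        return base  # initial load: fetch everything
    ts = last_loaded.strftime("%Y-%m-%d %H:%M:%S")
    return f"{base} WHERE {watermark_column} > '{ts}'"
```

After each successful run, store the maximum `last_modified` value you loaded and pass it as `last_loaded` next time.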


By following this pattern, you can efficiently query and synchronize data between RDM and analytics destinations like Redshift.

Data Integration Architecture

Loading data into Amazon Redshift prior to integrating it is an effective way to address data integration challenges by breaking the task into two sequential steps. This approach, a hallmark of the ELT (Extract, Load, Transform) architecture, enables organizations to "divide and conquer" by first loading raw data efficiently, and then performing integration as a separate process. Data integration inherently involves data transformation, which is best handled using data transformation pipelines that facilitate consistency, manageability, and automation of complex transformations. A further significant advantage of the ELT architecture is that data transformation and integration tasks occur directly within the target Redshift database itself. This in-database processing delivers high performance due to parallel execution, enables on-demand scalability, and eliminates the need and expense of separate data processing infrastructure, as all transformations are executed elastically within Redshift’s managed environment.

Get started today

Matillion's comprehensive data pipeline platform offers more than point solutions.