Integrate data from QuestDB to Amazon Redshift using Matillion

Our QuestDB to Redshift connector seamlessly transfers your data to Redshift within minutes, ensuring it remains up-to-date without requiring you to write code or manage complex ETL scripts.


Extracting data from QuestDB to Amazon Redshift

Extracting data from QuestDB is a crucial step for organizations seeking to integrate real-time analytics and time-series data into their broader data ecosystems. Amazon Redshift, with its robust data warehousing capabilities, is a popular destination for such data, enabling advanced analysis and reporting across a wide range of use cases. This article provides a step-by-step guide on moving data from QuestDB to Redshift. We’ll begin by covering how to create an identity within QuestDB, ensuring secure and consistent access. For users leveraging Matillion for ETL jobs, we will review how to verify and, if necessary, obtain the appropriate JDBC driver for QuestDB connectivity. We’ll also address the key consideration of network connectivity between source and target environments to ensure reliable data transfer. Finally, the article will demonstrate how to query data from QuestDB—both for an initial bulk load and for ongoing incremental updates—laying the foundation for an effective and repeatable data extraction workflow.


What is QuestDB?

QuestDB is a high-performance, open-source time series database designed for demanding real-time analytics on large volumes of time-stamped data. Written in Java, it employs a column-oriented storage engine and leverages memory-mapped I/O to enable ultrafast ingestion rates—often exceeding millions of records per second—while supporting full SQL syntax including JOINs and complex aggregations. QuestDB features native support for popular protocols such as InfluxDB line protocol, PostgreSQL wire protocol, and REST, facilitating integration with a wide variety of data sources and visualization tools. Its zero-GC architecture and vectorized query execution provide low latency and efficient resource utilization, making it ideal for use cases spanning financial services, IoT, and observability.


What is Amazon Redshift?

Amazon Redshift is a fully managed, cloud-based data warehouse designed for OLAP workloads. Based on PostgreSQL, it adds columnar storage, parallel query execution, and advanced compression for fast complex queries on large datasets. Redshift integrates with AWS services like S3 and Glue, and business intelligence tools. Clusters are easily scalable for both performance and cost. Features like concurrency scaling and Spectrum allow dynamic resources and querying of exabyte-scale S3 data. Security is strong, with encryption at rest and in transit, VPC network isolation, and detailed identity and access controls.

Why Move Data from QuestDB into Amazon Redshift

Unlocking Advanced Analytics: Copying Data from QuestDB to Redshift

A data engineer or architect may seek to copy data from QuestDB to Amazon Redshift for several compelling reasons. First, QuestDB often stores time-series data or other valuable datasets that can provide significant business insights when analyzed appropriately. However, the true value of this data is often realized when it is integrated with information from additional sources, enabling more comprehensive analytics and richer decision-making. By offloading the integration and complex analytic workloads onto Redshift—a robust, scalable data warehouse—the operational burden on QuestDB is minimized. This ensures that QuestDB can maintain its performance for data ingestion and real-time querying, while Redshift handles heavy data transformations, complex joins, and large-scale analysis, facilitating efficient and effective utilization of all available data.

Creating a User Identity in QuestDB

QuestDB, as of version 7.x (and current preview versions), does not support granular, SQL-based user management such as `CREATE USER` or `GRANT`. QuestDB uses a simplified security model centered around database instance access and, optionally, authentication via HTTP or the PostgreSQL wire protocol.

Below are instructions to secure access and, where relevant, configure authentication for different interfaces.


1. HTTP Authentication (REST and Web Console)

You can enable HTTP Basic Authentication for the web console and REST API endpoints.

Steps:

  1. Edit the configuration file (`conf/server.conf`) or use environment variables.

  2. Ensure the HTTP interface is not read-only:

     ```ini
     http.security.readonly=false
     ```

  3. Set the authentication type to Basic:

     ```ini
     http.security.authentication=basic
     ```

  4. Define the username and password, via configuration file keys or environment variables:

     ```ini
     http.security.username=myuser
     http.security.password=mypassword
     ```

     In a Docker environment, use (for example):

     ```bash
     -e QDB_HTTP_SECURITY_AUTHENTICATION=basic \
     -e QDB_HTTP_SECURITY_USERNAME=myuser \
     -e QDB_HTTP_SECURITY_PASSWORD=mypassword
     ```

Result: The HTTP endpoints now require authentication using these credentials.
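As a quick sanity check, the credentials above can be exercised against QuestDB's REST API using HTTP Basic authentication. The sketch below simply builds the `Authorization` header value by hand; the username and password are the placeholder values from the configuration example, and any HTTP client can then attach the header:

```python
import base64

def basic_auth_header(user: str, password: str) -> str:
    """Build the value of the HTTP Basic `Authorization` header."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"

# Placeholder credentials matching the configuration example above
header = basic_auth_header("myuser", "mypassword")
# Attach to a request against the REST endpoint, e.g.:
#   GET http://localhost:9000/exec?query=SELECT%201
#   Authorization: <header>
```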


2. PostgreSQL Wire Protocol Authentication

QuestDB supports connecting via the PostgreSQL protocol. You can enforce user authentication for this interface.

Steps:

  1. Edit the configuration file or use environment variables:

     ```ini
     pg.enabled=true
     pg.security.authentication=true
     pg.security.username=myuser
     pg.security.password=mypassword
     ```

  2. Optionally, set the binding address and port:

     ```ini
     pg.factory.port=8812
     ```

  3. Restart QuestDB so the configuration changes take effect.

Result: Attempting to connect over the PostgreSQL wire protocol (`psql`, JDBC, etc.) will now require the supplied username and password.
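Once PG-wire authentication is enabled, clients connect exactly as they would to PostgreSQL. The helper below assembles a libpq-style connection string from the values configured above (host, credentials, and database name are placeholder assumptions); the resulting DSN can be passed to a driver such as psycopg2:

```python
def questdb_pg_dsn(host: str = "localhost", port: int = 8812,
                   user: str = "myuser", password: str = "mypassword",
                   dbname: str = "qdb") -> str:
    """Assemble a libpq-style DSN for QuestDB's PostgreSQL wire interface."""
    return (f"host={host} port={port} dbname={dbname} "
            f"user={user} password={password}")

dsn = questdb_pg_dsn()
# e.g. psycopg2.connect(dsn) would now be challenged for these credentials
```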


3. Access Control Notes

  • At present, all configured users have the same level of privileges (full database access).
  • There is no support for per-table, per-database, or schema-level authorization.
  • Disabling or securing the database at the network level (firewall, container networks) is recommended for further security.

Example: Environment Variable Setup (Docker)

```bash
docker run -p 9000:9000 -p 8812:8812 \
  -e QDB_HTTP_SECURITY_AUTHENTICATION=basic \
  -e QDB_HTTP_SECURITY_USERNAME=admin \
  -e QDB_HTTP_SECURITY_PASSWORD=secretpassword \
  -e QDB_PG_SECURITY_AUTHENTICATION=true \
  -e QDB_PG_SECURITY_USERNAME=admin \
  -e QDB_PG_SECURITY_PASSWORD=secretpassword \
  questdb/questdb
```


Reference

See the QuestDB configuration documentation for details and further options regarding authentication and security.

Installing the JDBC Driver

At the time of writing, the QuestDB JDBC driver is not bundled by default with Matillion Data Productivity Cloud. This is mainly due to licensing or redistribution restrictions, so you must manually obtain and install the required driver before connecting Matillion to your QuestDB instance.

1. Download the QuestDB JDBC Driver
Visit the official QuestDB documentation at https://questdb.io/docs/reference/api/jdbc/ to locate and download the JDBC driver. When available, select the Type 4 JDBC driver as it is the most suitable for modern environments and does not require any client-side libraries apart from the driver itself.

2. Upload the JDBC Driver to Matillion Data Productivity Cloud
To make the driver available to your Matillion environment, upload the downloaded `.jar` file using the agent interface. Matillion provides full step-by-step instructions on how to upload and configure third-party JDBC drivers at this URL:
https://docs.matillion.com/data-productivity-cloud/agent/docs/uploading-external-drivers/

3. Configure and Use the Driver
Once the driver has been uploaded and Matillion has been configured to recognize it, you can use the driver to create connections to your QuestDB instance for querying and data actions. Guidance on how to set up and use JDBC drivers within Matillion Data Productivity Cloud can be found here:
https://docs.matillion.com/data-productivity-cloud/designer/docs/database-query/

Refer to both the QuestDB and Matillion documentation for the most up-to-date compatibility advice and further configuration options.
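Because QuestDB exposes the PostgreSQL wire protocol, the connection URL supplied to a JDBC client typically follows the PostgreSQL JDBC format pointed at QuestDB's PG-wire port. The sketch below builds such a URL; the hostname is a placeholder, and 8812 and `qdb` are QuestDB's default port and database name:

```python
def questdb_jdbc_url(host: str, port: int = 8812, dbname: str = "qdb") -> str:
    """Build a PostgreSQL-style JDBC URL for a QuestDB instance."""
    return f"jdbc:postgresql://{host}:{port}/{dbname}"

# Placeholder hostname for illustration
url = questdb_jdbc_url("questdb.internal.example.com")
```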

Checking network connectivity

To connect Matillion Data Productivity Cloud to your QuestDB database, you must ensure that QuestDB allows incoming network connections from the appropriate sources on the relevant ports (by default, 9000 for HTTP and 8812 for the PostgreSQL wire protocol), depending on your deployment configuration.

Additionally, if you are using a DNS hostname to reference your QuestDB database (rather than a raw IP address), ensure that the Full SaaS agent or your Hybrid SaaS agent can resolve the DNS address. If DNS resolution fails, the Matillion service will not be able to establish a connection to your QuestDB instance.
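A quick way to verify both DNS resolution and TCP reachability from the agent's environment is a short script like the following; the hostname and port are placeholders to adapt to your deployment:

```python
import socket

def check_endpoint(host: str, port: int, timeout: float = 3.0) -> str:
    """Return 'ok', 'dns-failure', or 'tcp-failure' for host:port."""
    try:
        addr = socket.gethostbyname(host)  # DNS resolution step
    except socket.gaierror:
        return "dns-failure"
    try:
        # Plain TCP connection attempt (no protocol handshake)
        with socket.create_connection((addr, port), timeout=timeout):
            return "ok"
    except OSError:
        return "tcp-failure"

# Placeholder hostname and QuestDB's default PG-wire port
status = check_endpoint("questdb.internal.example.com", 8812)
```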

Querying Data from QuestDB

This guide explains how to query data from a QuestDB database using SQL, addresses type conversion considerations between QuestDB and Redshift, and demonstrates recommended patterns for loading data.


Basic SQL SELECT Queries in QuestDB

QuestDB supports SQL queries familiar to users of relational databases. Here are some query examples:

```sql
-- Select all columns and rows from a table
SELECT * FROM trades;

-- Select specific columns
SELECT timestamp, price, volume FROM trades;

-- Time-based filtering with QuestDB's timestamp syntax
SELECT * FROM trades WHERE timestamp >= '2024-06-01T00:00:00Z';

-- Aggregate data
SELECT symbol, COUNT(*), AVG(price) FROM trades GROUP BY symbol;
```
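When such queries are issued over QuestDB's REST API, the `/exec` endpoint returns a JSON payload with a `columns` list and a row-oriented `dataset`. A small helper (a sketch; the sample payload below is illustrative, not real output) converts that shape into a list of dicts for downstream processing:

```python
def rows_from_exec(payload: dict) -> list:
    """Zip /exec column names with each row of the dataset."""
    names = [col["name"] for col in payload["columns"]]
    return [dict(zip(names, row)) for row in payload["dataset"]]

# Illustrative payload shaped like a /exec response
sample = {
    "columns": [{"name": "symbol", "type": "SYMBOL"},
                {"name": "price", "type": "DOUBLE"}],
    "dataset": [["BTC-USD", 64250.5], ["ETH-USD", 3412.0]],
}
rows = rows_from_exec(sample)
```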


Handling Datatype Conversion (QuestDB ↔ Redshift)

When moving data between QuestDB and Amazon Redshift, you may encounter datatype differences. For example:

| QuestDB Type | Closest Redshift Type |
| --- | --- |
| LONG | BIGINT |
| DATE | DATE |
| TIMESTAMP | TIMESTAMPTZ |
| SYMBOL, STRING | VARCHAR |
| DOUBLE | DOUBLE PRECISION |
| BOOLEAN | BOOLEAN |

Ensure that mappings are explicitly handled in your ETL or querying framework to prevent data mismatches or truncation.
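One way to handle the mapping explicitly is to drive Redshift DDL generation from a lookup table, so any unmapped QuestDB type fails loudly instead of being silently mistranslated. A minimal sketch (the `trades` columns are illustrative):

```python
# Mapping from the table above; extend as needed for your schemas
QUESTDB_TO_REDSHIFT = {
    "LONG": "BIGINT",
    "DATE": "DATE",
    "TIMESTAMP": "TIMESTAMPTZ",
    "SYMBOL": "VARCHAR",
    "STRING": "VARCHAR",
    "DOUBLE": "DOUBLE PRECISION",
    "BOOLEAN": "BOOLEAN",
}

def redshift_ddl(table, columns):
    """Emit a CREATE TABLE statement, failing on unmapped QuestDB types."""
    parts = []
    for name, qtype in columns:
        if qtype not in QUESTDB_TO_REDSHIFT:
            raise ValueError(f"No Redshift mapping for QuestDB type {qtype}")
        parts.append(f"{name} {QUESTDB_TO_REDSHIFT[qtype]}")
    return f"CREATE TABLE {table} ({', '.join(parts)});"

ddl = redshift_ddl("trades", [("timestamp", "TIMESTAMP"), ("price", "DOUBLE")])
```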


Best Pattern: Initial Load and Incremental Loads

To efficiently replicate or migrate data from QuestDB to Redshift (or any destination), follow a two-phase strategy:

  1. Initial Load: Extract all data.
  2. Incremental Loads: Periodically extract new or updated records.

Both phases can use the same Database Query component in ETL tools such as Matillion; only the SQL differs.

Initial Load Example

Perform the initial load without any filter clause:

```sql
SELECT * FROM trades;
```

This statement fetches the entire dataset for the first ingestion.

Incremental Load Example

Subsequent loads should filter data based on a watermarked column, typically a timestamp or an incrementing ID:

```sql
SELECT * FROM trades
WHERE timestamp > '2024-06-10T00:00:00Z';
```

In the above query, replace `'2024-06-10T00:00:00Z'` with the last-loaded timestamp value.
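Since the two phases differ only in the `WHERE` clause, the extraction SQL can be generated from a stored watermark. A minimal sketch (table and column names are illustrative):

```python
def extraction_query(table, ts_column, watermark=None):
    """Full extract when no watermark exists, incremental extract otherwise."""
    if watermark is None:
        return f"SELECT * FROM {table};"
    return f"SELECT * FROM {table} WHERE {ts_column} > '{watermark}';"

initial = extraction_query("trades", "timestamp")
incremental = extraction_query("trades", "timestamp", "2024-06-10T00:00:00Z")
# After each run, persist MAX(timestamp) from the extracted batch
# as the next run's watermark.
```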

Tip: For a deeper understanding of incremental load strategies with ETL tools, see Incremental Load: Data Replication Strategy (Matillion).


With this approach, you can efficiently manage and query time-series or tabular data from QuestDB databases, supporting both bulk transfers and low-latency incremental updates to downstream systems like Redshift.

Data Integration Architecture

Loading data in advance of integration, as practiced in the ELT (Extract, Load, Transform) architecture, offers a strategic "divide and conquer" approach by separating the process into two distinct steps: loading data into the target system, and then transforming and integrating it within that environment. Data integration ultimately relies on robust data transformation, and the most effective means to achieve this is through dedicated data transformation pipelines. One of the major advantages of the ELT architecture is that both transformation and integration activities are executed directly inside the target Redshift database. This not only accelerates processing and enables on-demand workflows, but also scales efficiently with Redshift’s compute resources—removing the need to provision or pay for additional external data processing infrastructure.

Get started today

Matillion's comprehensive data pipeline platform offers more than point solutions.