Using the HubSpot Query component in Matillion ETL for Amazon Redshift
Matillion uses the Extract-Load-Transform (ELT) approach to deliver quick results for a wide range of data processing purposes, from customer behaviour analytics and financial analysis to reducing the cost of synthesising DNA. The HubSpot Query component in Matillion ETL for Amazon Redshift presents an easy-to-use graphical interface that lets you connect to HubSpot and pull data into Amazon Redshift.
Customers typically use the component to measure the effectiveness of their marketing by combining HubSpot data with other data sources, such as sales data, to get a holistic view.
This self-contained connector is included within the scope of a standard Matillion license, so no additional software installation or cost is required.
Watch our tutorial video for a demonstration of how to set up and use the HubSpot Query component in Matillion ETL for Amazon Redshift.
The first step in configuring the HubSpot Query component in Matillion ETL for Amazon Redshift is to provide Matillion with Authentication to HubSpot.
The Matillion HubSpot component requires OAuth to be set up to authenticate access. Clicking the three dots next to the Authentication property brings up a pop-up box listing all the HubSpot OAuth entries set up in Matillion.
Next, use the Data Source drop-down to identify which data is to be loaded into Amazon Redshift.
In the Data Selection area, choose the required fields from the data source. An example list of the fields available in the selected Data Source object, including any custom objects, is in the image below. The chosen fields will form the new table which is created in Amazon Redshift.
Data Source Filter
Use a filter to restrict the returned data if required.
Running the HubSpot Query
Before you run the component, give the Target Table a name. This is the name of the new table that is created in Amazon Redshift to hold the data. Additionally, specify an S3 Staging Area; this is an S3 bucket which will temporarily store the results of the query before they are loaded into Amazon Redshift.
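To make the staging step more concrete, the sketch below builds the kind of Redshift COPY statement that loads staged files from S3 into a target table. It is only an illustration: that the component loads the data this way is an assumption, and the function name `build_copy_statement`, the table name, the bucket, the prefix, and the IAM role ARN are all hypothetical placeholders. In normal use the component handles this step for you.

```python
def build_copy_statement(target_table: str, bucket: str, prefix: str,
                         iam_role_arn: str) -> str:
    """Return a Redshift COPY statement that loads staged CSV files from S3.

    Hypothetical helper for illustration; not part of Matillion.
    """
    s3_path = f"s3://{bucket}/{prefix}"
    return (
        f'COPY "{target_table}"\n'
        f"FROM '{s3_path}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        "FORMAT AS CSV;"
    )


# All values below are made-up placeholders.
statement = build_copy_statement(
    target_table="hubspot_contacts",
    bucket="example-staging-bucket",
    prefix="matillion/hubspot/",
    iam_role_arn="arn:aws:iam::123456789012:role/example-redshift-load",
)
print(statement)
```

The point of the sketch is the flow: query results land in the S3 Staging Area first, and the Target Table is then populated from those staged files.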
The HubSpot Query component also has a Limit property, which can be used to force an upper limit on the number of records returned.
Once configured correctly, the border on the HubSpot icon will turn green.
To query your data and bring it into Amazon Redshift, run the Orchestration job manually or by using the Scheduler.
The HubSpot Query component offers an “Advanced” mode as an alternative to the default “Basic” mode.
In Advanced Mode, you can write a SQL-like query over all the available fields in the data model. This is automatically translated into the correct API calls to retrieve the data requested.
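As a rough illustration of how the Basic mode settings (Data Source, Data Selection, Data Source Filter, and Limit) might map onto a single Advanced mode query, the sketch below assembles a SQL-like string from those parts. The exact dialect accepted by the component is not described here, so the `SELECT ... FROM ... WHERE ... LIMIT` shape is an assumption, and the helper `build_query`, the `Contacts` data source, and the field names are hypothetical.

```python
from typing import Optional, Sequence


def build_query(data_source: str, fields: Sequence[str],
                where: Optional[str] = None,
                limit: Optional[int] = None) -> str:
    """Assemble a SQL-like query from a data source, fields, filter and limit.

    Hypothetical helper for illustration; not part of Matillion.
    """
    if not fields:
        raise ValueError("at least one field is required")
    query = f"SELECT {', '.join(fields)} FROM {data_source}"
    if where:
        # Corresponds to the Data Source Filter.
        query += f" WHERE {where}"
    if limit is not None:
        if limit < 1:
            raise ValueError("limit must be a positive integer")
        # Corresponds to the Limit property.
        query += f" LIMIT {limit}"
    return query


# Data source and field names are made-up examples.
query = build_query(
    "Contacts",
    ["Email", "FirstName", "CreatedDate"],
    where="CreatedDate >= '2024-01-01'",
    limit=100,
)
print(query)
```

A query of this shape would be typed directly into the component in Advanced mode; the Python wrapper is only there to show which part corresponds to which property.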
Transforming the Data
Once the required HubSpot data is in Amazon Redshift, you can perform transformation jobs, such as combining it with ERP data.
In this way, you can build out the rest of your downstream transformations and analysis, thus taking advantage of Amazon Redshift’s power and scalability.
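The kind of combination described above usually comes down to a join between the loaded HubSpot table and another table. The sketch below shows that idea with an in-memory SQLite database standing in for Amazon Redshift so that it runs anywhere. The table names `hubspot_contacts` and `erp_orders`, their columns, and the sample rows are all invented for illustration, and in Matillion you would normally build the equivalent logic in a Transformation job rather than write it by hand.

```python
import sqlite3

# In-memory SQLite stands in for Amazon Redshift; the join itself is plain SQL.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE hubspot_contacts (email TEXT, campaign TEXT);
    CREATE TABLE erp_orders (email TEXT, amount REAL);

    -- Made-up sample rows for illustration only.
    INSERT INTO hubspot_contacts VALUES
        ('a@example.com', 'spring'),
        ('b@example.com', 'spring'),
        ('c@example.com', 'autumn');
    INSERT INTO erp_orders VALUES
        ('a@example.com', 120.0),
        ('a@example.com', 30.0),
        ('c@example.com', 75.0);
    """
)

# Revenue per marketing campaign: contacts with no orders still count as contacts.
rows = conn.execute(
    """
    SELECT c.campaign,
           COUNT(DISTINCT c.email)    AS contacts,
           COALESCE(SUM(o.amount), 0) AS revenue
    FROM hubspot_contacts AS c
    LEFT JOIN erp_orders AS o ON o.email = c.email
    GROUP BY c.campaign
    ORDER BY c.campaign
    """
).fetchall()

for campaign, contacts, revenue in rows:
    print(campaign, contacts, revenue)

conn.close()
```

A left join is used so that campaigns whose contacts have not yet placed an order still appear in the result, which matters when the goal is to measure marketing effectiveness.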