Using the SAP Netweaver Query component in Matillion ETL for Amazon Redshift
Matillion uses the Extract-Load-Transform (ELT) approach to deliver quick results for a wide range of data processing purposes: everything from customer behavior analytics, financial analysis, and even reducing the cost of synthesizing DNA.
The SAP Netweaver Query component in Matillion ETL for Amazon Redshift presents an easy-to-use graphical interface, enabling you to connect to SAP Netweaver and pull tables from there into your Amazon Redshift data warehouse. Many of our customers are using this component to bring SAP Netweaver data into Amazon Redshift to combine with other data, as well as, enable aggregating and reporting on data in Netweaver.
The connector is completely self-contained: no additional software installation is required. It’s within the scope of an ordinary Matillion license, so there is no additional cost for using the features.
The first step in configuring the SAP Netweaver Query component is to provide the host details to Matillion ETL. This will allow Matillion to connect to your SAP Netweaver. This is usually the IP address of the SAP Netweaver Instance’s Application Server. Clicking on the 3 dots next to the server property on your Matillion canvas will bring a pop-up box where you can enter the Server IP.
User and Password
Next, you will need to give authenticated details of your SAP Netweaver User and Password to access the data you require:
This is the code of the SAP Netweaver client authenticating to the SAP system:
Now you can choose the data that you want to load into Amazon Redshift from the Data Source drop down. This is a list of the some of the tables available in your Netweaver:
Please note that as SAP Netweaver ships with over 100,000 tables. Not all of these tables will be available in the drop-down here. To add more Data Sources please see our SAP Netweaver support article.
After you chose the data source(s), next, select the required fields from the data source in the Data Selection. This is a list of the columns available to you in the Data Source previously selected. This will form the new table which you will create in Amazon Redshift with the help of Matillion.
Data Source Filter
If required, you can add a filter to limit or refine the returned data. This filter will run as the WHERE clause in SAP Netweaver:
Running the SAP Netweaver Query
Before the component can be run, you need to name the Target Table. This is the name of the new table which will be created to write the data into in Redshift. Also, an S3 Staging Area must be specified. This is an S3 bucket that is used to temporarily store the results of the query before it is loaded into Amazon Redshift.
This component also has a Limit property. You can use the Limit property to force an upper limit on the number of records returned. This can help improve the performance of your query.
You can run the Orchestration job, either manually or use the Scheduler, to query your data and bring it into Amazon Redshift.
Note – if when running this you encounter an error “SAP JCo library not found” you will need to manually add this into your Matillion instance. For further details please see here:
The SAP Netweaver Query component offers an “Advanced” mode instead of the default “Basic” mode.
In Advanced mode, you can write a SQL-like query over all the available tables in SAP. This is automatically translated into the correct API calls to retrieve the data requested.
Transforming the Data
Once you have brought the required data into Amazon Redshift from SAP Netweaver, you can use it in a Transformation job:
In this way, you can build out the rest of your downstream.