How to Output Transformed Customer Data to Salesforce Using Matillion ETL
Applying transformations to an organization’s data helps make that data useful for analytics. However, transforming data can mean that data stored in source systems, such as Salesforce, differs from the data stored in a data warehouse. It can be a challenge for data teams to keep data in sync.
To help with this problem, Matillion has released a Salesforce Output component in v1.46. With this new component, a user can use the Salesforce API to write the contents of a source table or view back into Salesforce objects. The user can employ operations such as: insert, update, delete, and upsert against enriched, cleansed data from their data warehouse to create a single source of truth.
In this example, we demonstrate how to use the Salesforce Output component with a Matillion ETL job that will pull customer data from a Google Sheet and output it into our Salesforce Leads table. We will also create a Transformation Job to introduce state names and company names that are respective to each customer before loading the final table to Salesforce.
Ingesting customer data
First, we create an Orchestration Job called salesforce_output_orch. Once we’ve switched into the job, we will search the components panel to find a Google Sheet Query component.
Clicking the component itself will expose the properties tab. The first properties authenticate your Google account. After the credentials have been configured, we can now search for our spreadsheet by writing its name in the Sheet Name parameter. In our case, this sheet is called ‘Random List’. Once we select the sheet, we can go into the data source parameter, which will list out the column names of our sheet. The column names are usually those in the first row of the spreadsheet. We will be selecting all columns.
To make sure that everything was correctly configured and the data is successfully being pulled, we can Sample the data within Matillion ETL. To do so, we go into the Sample tab and click on data.
As you can see, we’ve successfully pulled the data and transferred it into a temporary table called sc_customers.
Let’s now look for the S3 Load Generator component, which we will use to populate a table called us_state_lookup with US states data.
This table will contain two columns: one for state codes and one for state names.
Before moving on to the Transformation Job, we need to run this Orchestration Job, which will actually create and save these tables. To do this, we can right-click and select Run Job.
Transforming customer data
Now we can use our newly created tables in the Transformation Job, which we will call customer_transform.
This job will use the Current Customers table, the States Lookup table and the Company Names list to create our final table, which we will output into Salesforce. To do this, we first need to find our sc_customers table and our us_state_lookup table in the environments panel: environment_name >> schema_name >> table_name.
Once we’ve found these tables we can just drag them onto the canvas. Alternatively, we can use a Table Input component and fill out the properties in order to look for the specific tables.
Again, we can use the Sample tab to make sure that our tables are correctly populated and ready to use.
We will now be using a Fixed Flow component to generate a few lines of fixed input which will contain the customer’s id alongside the company name.
Sampling this data will give us the following:
Let’s now move on to the actual transformation. Find the Join component in the Components panel and drag it onto the canvas.
We will configure this component to create a left Join based around the state_codes and an inner Join based around the customer_id.
This will append the state names and company names to their corresponding customer. If you were to sample this, you should have two new columns: one called state_name, which will store the state names for each customer based on their state code; and one called company_name, which will contain the company name for each customer.
Now that we have transformed our initial table to contain state names and company names, we can use a Rewrite Table Component, which will save our joined table into a new table we will call final_customer_table.
Output data to Salesforce
Our final table is now ready to be loaded into Salesforce. We can go back to our Orchestration Job and complete this workflow.
Let’s start by using an AND component. This will ensure that both the Customer Data table and the States Lookup table are created and populated before any transformation takes place. Then we can drag our customer_transform job onto the orchestration canvas and connect it to the AND component.
Finally, we drag our long-awaited Salesforce Output component from the Components panel and attach it to our Transformation Job. After filling out the credentials, we need to choose two things: The Source table, which in this case is the final_customer_table generated from the transformation job; and the Target object, in this case the Lead table. Finally, we will column map between these two tables, like so:
a. | Final_customers_table | Leads table |
---|---|---|
b. | first name | FirstName |
c. | last name | LastName |
d. | ||
e. | industry | Industry |
f. | city | City |
g. | state name | State |
h. | company | Company |
Now when you run this job, the Salesforce table Lead will be populated with the final_customer_table data in the relevant columns within your Salesforce account.
That’s how to use the new Salesforce Output component, which will help you keep consistent data between your data warehouse and your Salesforce account. Be sure to check out the other new features and components in version 1.46, including Assert Components and new Shared Jobs to manage Amazon Redshift clusters.
The post How to Output Transformed Customer Data to Salesforce Using Matillion ETL appeared first on Matillion.