Using the Google Drive Table Component in Matillion ETL for BigQuery

Using the Google Drive Table Component in Matillion ETL for BigQuery

Matillion uses the Extract-Load-Transform (ELT) approach to delivering quick results for a wide range of data processing purposes: everything from customer behavior analytics, financial analysis, and even reducing the cost of synthesizing DNA.

The Google Drive Table component presents an easy-to-use graphical interface, enabling you to pull data from your files stored on Google Drive directly into BigQuery. Customers are using this to pull user maintained data from their Google Drive into BigQuery, to enhance other data from different sources.

The connector is completely self-contained: no additional software installation is required. It’s within the scope of an ordinary Matillion license, so there is no additional cost for using the features.

In this example, we will pull some data from a ‘carriers’ Google Sheet saved on our Google Drive:

 

Google Drive Table Component in Matillion ETL for BigQuery - gsheet

 

New Table Name

This is the name of the new table which Matillion will create in BigQuery to load the contents of the file into:

 

Google Drive Table Component in Matillion ETL for BigQuery - New table

 

You have the options to specify whether the top row of data is the header and as such contains the column names for the new table your creating. You can also specify the Cell Range to limit the data that you push through.

Table Metadata

Now you will need to give the metadata for the new linked table you are creating.

 

Google Drive Table Component in Matillion ETL for BigQuery - Metadata

 

Create/Replace

This is the strategy for Creating the linked table. We recommend you use ‘Replace’ unless you are certain the component will only be run once:

 

Google Drive Table Component in Matillion ETL for BigQuery - create replace

 

Google Drive URL

This is the file URL of the Google Sheet on Google Drive that you will use to create the table in BigQuery. Right click on your file on Google Drive and click “Get shareable link”.

 

Google Drive Table Component in Matillion ETL for BigQuery - google drive url

 

This will copy the link into the clipboard which you can then easily paste directly into the component:

 

Google Drive Table Component in Matillion ETL for BigQuery - edit properties

 

File Format

Select the format of the file on Google Drive. Matillion supports the below formats out of the box:

 

Google Drive Table Component in Matillion ETL for BigQuery - file format

 

Number of Errors Allowed

By default Matillion will NOT create the linked table if it finds an error row in the file. However, you can set the maximum number of errors allowed in the file before the component errors

Header Rows to Skip

Set this to avoid loading any headers into the linked table.

Running the Google Drive TableComponent in Matillion ETL for BigQuery

You can run the Orchestration job manually or using the Scheduler. This will query your data once or regularly, respectively, and bring it into BigQuery.

 

Google Drive Table Component in Matillion ETL for BigQuery - run component

 

Transforming the Data

Once you have brought the required Google Drive data into BigQuery, you can use it in a Transformation job, perhaps to enhance existing data:

 

Google Drive Table Component in Matillion ETL for BigQuery - transformation

 

In this way, you can build out the rest of your downstream transformations and analysis, taking advantage of BigQuery’s power and scalability.

Useful Resources

Google Drive Table component

 

For more tips on loading your data into BigQuery download our comprehensive 60+ page guide on Optimizing Google BigQuery.

Matillion-Optimizing Google BigQuery - Email banner