Using the Twitter Query Component in Matillion ETL for BigQuery

Video
Watch our tutorial video for a demonstration of how to set up and use the Twitter Query component in Matillion ETL for BigQuery: https://youtu.be/-qSn_SuCjKc

Data extraction architecture
Like most of Matillion's Load/Unload components, acquiring data from Twitter is a two-stage process: the component first extracts the data from the Twitter API into a Cloud Storage staging area, and then bulk-loads it from there into the target BigQuery table.
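For context, that second stage is essentially a standard BigQuery load job from Cloud Storage. Here is a minimal sketch of what it amounts to, using the google-cloud-bigquery Python client; the bucket, dataset and table names (and the CSV format) are assumptions for illustration, and the component handles all of this for you.

```python
# Minimal sketch of stage 2: bulk-loading staged files from Cloud Storage
# into BigQuery. Matillion performs the equivalent of this automatically;
# the bucket, dataset and table names here are hypothetical.
from google.cloud import bigquery

client = bigquery.Client()

job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.CSV,
    autodetect=True,                     # infer the schema from the staged file
    write_disposition="WRITE_TRUNCATE",  # recreate the staging table on each load
)

load_job = client.load_table_from_uri(
    "gs://my-staging-bucket/twitter_extract.csv",  # file written during stage 1
    "my_project.my_dataset.twitter_stage",         # the Target Table property
    job_config=job_config,
)
load_job.result()  # wait for the load to finish
```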
Data Orchestration
The Twitter Query is an Orchestration component. To use it, create an Orchestration job, locate the Twitter Query in the component list, and drag it onto the Orchestration job canvas. Selecting it on the canvas lets you edit its main properties (an illustrative example follows the list):
- Authentication - set this to the name of the OAuth credentials you intend to use
- Data Source - choose a table or view from those available in the Twitter data model. There are around 20, covering the main elements of the model, including Tweets, Direct Messages, Followers and Lists
- Data Selection - choose the columns to extract; the columns available depend on which Data Source you're using
- Data Source Filter - you should normally add one or more filters to restrict the extract to just the data of interest
- Target Table - choose a name for the new BigQuery table. Note that this is a "staging" table, so you'll need to move the data on after loading (see Staging and Transformations below)
- Cloud Storage Staging Area - choose one of your existing buckets. It will be used temporarily during the bulk load (stage 2, as mentioned above)
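To make those properties concrete, here is an illustrative example expressed in Python. Everything below is hypothetical (the OAuth entry name, columns, filter and bucket are made up for illustration), and in practice you set all of this in the component's property editor rather than in code.

```python
# Illustrative only: an example Twitter Query configuration expressed as data.
# Every value here is hypothetical; in Matillion these are set in the
# component's property editor, not in code.
twitter_query_properties = {
    "Authentication": "my-twitter-oauth",               # name of your OAuth credentials
    "Data Source": "Tweets",                             # table/view from the Twitter data model
    "Data Selection": ["Id", "Text", "Created_At"],      # columns to extract (hypothetical names)
    "Data Source Filter": {"Screen_Name": "matillion"},  # restrict to the data of interest
    "Target Table": "twitter_stage",                     # staging table created in BigQuery
    "Cloud Storage Staging Area": "gs://my-staging-bucket",
}

# Conceptually, Data Source, Data Selection and Data Source Filter combine into
# a query against the Twitter data model along these lines (column names again
# hypothetical):
illustrative_query = (
    "SELECT Id, Text, Created_At "
    "FROM Tweets "
    "WHERE Screen_Name = 'matillion'"
)
```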

Running the Extract and Load
Matillion has a built-in scheduler, which you can use to run jobs automatically. During testing and development, however, you'll probably just want to run the job interactively with a right-click on the canvas.
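Jobs can also be triggered from outside the Matillion UI over its REST API, which is handy once you move beyond interactive runs. Below is a hedged sketch using Python's requests library; the URL pattern follows the v1 API as best I understand it, and the host, credentials and group/project/version/job/environment names are all hypothetical, so check your instance's API documentation before relying on it.

```python
# Hedged sketch: triggering the Orchestration job over Matillion ETL's REST API.
# The URL pattern below is an assumption based on the v1 API and may differ by
# version; the host, names and credentials are all hypothetical.
import requests

base_url = "http://my-matillion-instance/rest/v1"  # hypothetical instance address
job_path = (
    "/group/name/MyGroup"
    "/project/name/MyProject"
    "/version/name/default"
    "/job/name/TwitterExtract"   # the Orchestration job containing the Twitter Query
    "/run"
)

response = requests.post(
    base_url + job_path,
    params={"environmentName": "MyEnvironment"},  # environment to run against
    auth=("api-user", "api-password"),            # hypothetical API credentials
)
response.raise_for_status()
print(response.json())  # details of the queued run
```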
Staging and Transformations
The Target Table property you chose is the name of a BigQuery “staging” table. That means it will be recreated every time the Twitter Query component runs. If the table already exists, it will be silently dropped first. It’s deliberately designed this way so you can do incremental loads and take advantage of BigQuery’s fast bulk loader. However, it does mean that you need to copy the newly-loaded data into a more permanent location after every load. The usual pattern is to call a new Transformation job immediately after the Load.
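In Matillion that Transformation job would typically read the staging table and append it to a persistent table. Purely for illustration, the equivalent operation sketched with the google-cloud-bigquery Python client might look like the following; the project, dataset and table names are hypothetical.

```python
# Illustrative sketch: append the freshly loaded staging table into a
# permanent table so the data survives the next run of the Twitter Query.
# In Matillion this step would normally be a Transformation job; the project,
# dataset and table names here are hypothetical.
from google.cloud import bigquery

client = bigquery.Client()

sql = """
INSERT INTO `my_project.my_dataset.twitter_history`
SELECT *
FROM `my_project.my_dataset.twitter_stage`
"""

query_job = client.query(sql)  # runs the copy as a standard query job
query_job.result()             # wait for it to complete
```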

Useful Links
- Twitter Query Component in Matillion ETL for BigQuery
- Component Data Model
- OAuth Set Up
- Integration information
- Video

Begin your data journey
Want to try the Twitter Query component in Matillion ETL for BigQuery? Request a free demo now, or launch on the Google Cloud Launcher.
Ian Funnell
Manager of Developer Relations