- Blog
- 10.18.2017
- Data Fundamentals, Dev Dialogues
Using the Twitter Query Component in Matillion ETL for Snowflake

Video
Watch our tutorial video for a demonstration of how to set up and use the Twitter Query Component in Matillion ETL for Snowflake: https://youtu.be/u0qf5nJHLAc
Data extraction architecture
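As with most of Matillion’s Load/Unload components, acquiring data from Twitter is a two-stage process. In stage 1, the data is extracted from Twitter and written to a temporary staging area in S3. In stage 2, it is bulk loaded from there into a Snowflake table.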
Data Orchestration
The Twitter Query is a data Orchestration component. To use it, create an Orchestration job and open it for editing. Then locate the Twitter Query in the component list and drag it onto the Orchestration job canvas. The main properties to set are:
- Authentication - set this to the name of the OAuth credentials you intend to use
- Data Source - choose a table or view from those available in the Twitter data model. The model has about 20 main elements, including Tweets, Direct Messages, Followers and Lists
- Data Selection - choose the columns, depending on which Data Source you’re using
- Data Source Filter - you should normally enter some filters to restrict the data of interest
- Limit - it’s a good idea to set this to avoid rate limiting errors
- S3 Staging Area - choose one of your existing S3 buckets. It will be used temporarily during the bulk load (stage 2, as mentioned above)
- Target Table - choose a name for the new Snowflake table. Note this is a “staging” table, so you’ll need to move the data on after loading (see the next section)
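
As a rough illustration, the properties above might be filled in as follows for a simple load of recent tweets. This is only a sketch in Python: the component itself is configured through the Matillion UI, and every value here (credential name, column names, filter, bucket and table name) is a hypothetical placeholder rather than a Matillion default. The helper `missing_properties` is likewise made up for this example.

```python
# Hypothetical property values for a Twitter Query component.
# The keys mirror the property names listed above; the values are
# placeholders chosen for illustration only.
twitter_query_properties = {
    "Authentication": "my_twitter_oauth",            # assumed OAuth entry name
    "Data Source": "Tweets",
    "Data Selection": ["ID", "Created_At", "Text"],  # assumed column names
    "Data Source Filter": [("SearchTerms", "Equal to", "snowflake")],  # assumed filter
    "Limit": 1000,                                   # keeps the request small
    "S3 Staging Area": "my-staging-bucket",          # assumed bucket name
    "Target Table": "stg_tweets",                    # assumed staging table name
}


def missing_properties(props):
    """Return the names of properties that are unset or empty."""
    return [name for name, value in props.items() if not value]
```

Calling `missing_properties(twitter_query_properties)` gives a quick check that nothing has been left blank before you run the job.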

Running the Extract and Load with the Twitter Query component in Matillion ETL for Snowflake
Matillion offers various ways to run Orchestration jobs, including its own built-in scheduler, which you can use to run jobs automatically. There is also an integration with SQS allowing you to synchronize Matillion jobs with an external scheduler. During testing and development, however, you’ll probably just want to run it interactively with a right-click on the canvas.
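
If you use the SQS route, the external scheduler has to put a message on the queue that Matillion is listening to. The sketch below only builds such a message body in Python. The field names (`group`, `project`, `version`, `environment`, `job`, `variables`) are an assumption based on the message format described in Matillion's SQS integration documentation, so check them against the documentation for your version. All the values are hypothetical.

```python
import json


def build_run_job_message(group, project, version, environment, job, variables=None):
    """Build a JSON message body asking Matillion to run an Orchestration job.

    The field names are assumed, not confirmed by this article.
    """
    return json.dumps({
        "group": group,
        "project": project,
        "version": version,
        "environment": environment,
        "job": job,
        "variables": variables or {},
    })


# Hypothetical names; replace them with those from your own Matillion project.
body = build_run_job_message(
    group="Analytics",
    project="Social",
    version="default",
    environment="Production",
    job="Load Tweets",
    variables={"search_term": "snowflake"},
)
# Sending the message is left out here. With an SQS client library you
# would pass `body` as the message body for your queue.
```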
Staging and Transformations
The Target Table property you chose is the name of a Snowflake “staging” table. Hence, it will be recreated every time the Twitter Query component runs. If the table already exists, it will be silently dropped first. It’s deliberately designed this way, so you can do incremental loads and take advantage of Snowflake’s fast bulk loader. However, it does mean that you need to copy the newly-loaded data into a more permanent location after every load. The usual pattern is to call a new Transformation job immediately after the Load.
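
The pattern can be sketched outside Matillion as follows. This example uses Python with an in-memory SQLite database as a stand-in for Snowflake, purely so that it runs anywhere. The table names `stg_tweets` and `tweets` and both function names are hypothetical, and in Matillion the second step would be a Transformation job rather than hand-written SQL.

```python
import sqlite3


def load_batch(conn, rows):
    """Mimic the Twitter Query component: recreate the staging table, then load it."""
    conn.execute("DROP TABLE IF EXISTS stg_tweets")
    conn.execute("CREATE TABLE stg_tweets (id INTEGER, text TEXT)")
    conn.executemany("INSERT INTO stg_tweets VALUES (?, ?)", rows)


def persist_new_rows(conn):
    """Mimic the follow-up Transformation job: copy unseen rows to the permanent table."""
    conn.execute("CREATE TABLE IF NOT EXISTS tweets (id INTEGER PRIMARY KEY, text TEXT)")
    conn.execute(
        "INSERT INTO tweets (id, text) "
        "SELECT s.id, s.text FROM stg_tweets s "
        "WHERE NOT EXISTS (SELECT 1 FROM tweets t WHERE t.id = s.id)"
    )


conn = sqlite3.connect(":memory:")

# First load: two tweets arrive and are copied to the permanent table.
load_batch(conn, [(1, "first"), (2, "second")])
persist_new_rows(conn)

# Second load overlaps with the first: the staging table is replaced,
# and only the tweet not seen before is added to the permanent table.
load_batch(conn, [(2, "second"), (3, "third")])
persist_new_rows(conn)

permanent_count = conn.execute("SELECT COUNT(*) FROM tweets").fetchone()[0]
staging_count = conn.execute("SELECT COUNT(*) FROM stg_tweets").fetchone()[0]
```

After the second load the staging table holds only the latest batch, while the permanent table has accumulated every distinct tweet seen so far.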




Useful Links
- Twitter Query component in Matillion ETL for Snowflake
- Component Data Model
- OAuth Set Up
- Integration information
- Video
Begin your data journey
Want to try the Twitter Query component in Matillion ETL for Snowflake? Arrange a free 1-hour training session now, or start a free 14-day trial.
Ian Funnell
Manager of Developer Relations