- Blog
- 11.01.2024
- Data Fundamentals, Product
Load data into Snowflake

Snowflake is a powerful columnar database, optimized for analytical data retrieval. In this article I'll describe some of the most efficient methods for loading data into Snowflake.
Data Manipulation Language (DML) statements - INSERT, UPDATE, and DELETE - are simple, but far too slow and cumbersome for bulk data volumes. Snowflake offers other, better strategies for bulk data loading:
- For simple automation, especially when scripting with local files, the SnowSQL CLI in combination with a FILE FORMAT object and the COPY command
- To handle bulk data loads from cloud storage providers (Amazon S3, Azure Blob, or Google Cloud Storage), an External Stage defined using a STORAGE INTEGRATION object, and the COPY command again. Storage Integrations allow you to make a secure connection to cloud storage without involving secret keys or access tokens.
I will also talk about the options for loading data from services and formats (such as Microsoft Excel) where Snowflake has no inbuilt connector.
Let's dive in, starting with local files and the SnowSQL CLI.
Load data to Snowflake from a Local File
A great way to load from a local file is to use the SnowSQL CLI client. There are two steps:
- Upload the local file into a Snowflake stage
- Create a FILE FORMAT and issue a COPY command to load into a table
Uploading a local file into a Snowflake stage
Once logged into SnowSQL, use a PUT command to transfer the file from your local filesystem into a Snowflake stage.
In the following example, the connection is opened from the shell, then the file is uploaded into a subdirectory inside the internal User stage, and listed to verify:
snowsql -c name-of-connection
PUT file://myfile.csv @~/temp;
LIST @~/temp;
Note that if you upload an uncompressed file in this way, SnowSQL will automatically gzip it on your behalf.
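If you would rather stage the file as-is, the automatic compression can be disabled; AUTO_COMPRESS is an optional parameter of the PUT command:
PUT file://myfile.csv @~/temp AUTO_COMPRESS = FALSE;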
Using a FILE FORMAT and COPY
Snowflake uses FILE FORMAT objects to determine how to interpret the contents of a file to be loaded.
In the following example, the file format defines an optionally compressed CSV file, with a header line, and fields optionally enclosed in double quotes:
CREATE OR REPLACE FILE FORMAT FF_MYFILE
  TYPE = CSV
  COMPRESSION = AUTO
  RECORD_DELIMITER = '\n'
  FIELD_DELIMITER = ','
  PARSE_HEADER = TRUE
  FIELD_OPTIONALLY_ENCLOSED_BY = '"';
Now that Snowflake has a way to parse the file, the last step is to create a new database table and transfer the contents into it. Notice how the table structure - including the column names and datatypes - is automatically inferred from the file.
CREATE OR REPLACE TABLE stg_mydata USING TEMPLATE
(SELECT ARRAY_AGG(OBJECT_CONSTRUCT(*))
WITHIN GROUP (ORDER BY ORDER_ID)
FROM TABLE(INFER_SCHEMA(LOCATION => '@~/temp/myfile.csv.gz',
FILE_FORMAT => 'FF_MYFILE')));
COPY INTO stg_mydata
FROM '@~/temp/myfile.csv.gz'
FILE_FORMAT = (FORMAT_NAME= 'FF_MYFILE')
MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE;
This process efficiently brings your local data to the Snowflake cloud for further integration and analysis.
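To check what was actually loaded, you can query the table's recent load activity using Snowflake's COPY_HISTORY table function (adjust the time window to suit):
SELECT *
FROM TABLE(INFORMATION_SCHEMA.COPY_HISTORY(
  TABLE_NAME => 'STG_MYDATA',
  START_TIME => DATEADD(hours, -1, CURRENT_TIMESTAMP())));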
To tidy up afterwards:
DROP FILE FORMAT FF_MYFILE;
RM @~/temp/myfile.csv.gz;
Loading data into a Snowflake Table from CSV in Cloud Storage
Snowflake's recommended solution for bulk loading from cloud storage (Amazon S3, Azure Blob or Google Cloud Storage) is a combination of an External Stage with a STORAGE INTEGRATION object plus the COPY command.
Storage Integrations are top-level, named objects that let you connect to cloud storage securely without having to explicitly manage and share secret keys or access tokens. As a security best practice, Snowflake strongly advises customers to use Storage Integration objects for all cloud storage interactions.
Creating a Storage Integration is an administrative task that you'll only need to perform once. Many External Stages can make use of the same Storage Integration, so you will probably only need a few External Stages - enough to cover all the cloud storage areas you use for file exchange. For security, you should never have to use the CREDENTIALS clause when creating an External Stage.
Once your Storage Integration and External Stage are in place you will need to:
- Create a suitable target database table
- Specify the location and format of the source data files in the COPY command
For the examples below, I'll use the following simple target table:
CREATE OR REPLACE TABLE stg_table (ID NUMBER(3,0) NOT NULL, VALUE VARCHAR);
AWS S3 to Snowflake
Start by creating an S3 Storage Integration object that includes an IAM role with permissions to use S3.
CREATE STORAGE INTEGRATION my_s3_integration
  TYPE = EXTERNAL_STAGE
  STORAGE_PROVIDER = 'S3'
  ENABLED = TRUE
  STORAGE_AWS_ROLE_ARN = '<arn of your IAM role>'
  STORAGE_ALLOWED_LOCATIONS = ('s3://mybucket/');
In your AWS console, you must ensure that the chosen IAM role has sufficient privileges to use your S3 buckets.
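The trust relationship works in both directions: the DESCRIBE output tells you which IAM user and external ID Snowflake will present, and these must go into the IAM role's trust policy:
DESCRIBE STORAGE INTEGRATION my_s3_integration;
-- Note the STORAGE_AWS_IAM_USER_ARN and STORAGE_AWS_EXTERNAL_ID values,
-- then add them to the trust policy of your IAM role in the AWS console.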
Now that AWS S3 access is secure, create an external stage object using the Storage Integration. Note there is no need to supply AWS credentials when creating the Stage:
CREATE OR REPLACE STAGE my_s3_stage
  URL = 's3://mybucket'
  STORAGE_INTEGRATION = my_s3_integration;
Lastly, issue a COPY command to transfer the data from S3 to your waiting table:
COPY INTO stg_table
FROM @my_s3_stage/my-data-file.csv
FILE_FORMAT = (TYPE = 'CSV' FIELD_OPTIONALLY_ENCLOSED_BY = '"')
ON_ERROR = 'CONTINUE';
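With ON_ERROR = 'CONTINUE', rejected rows are skipped silently, so it's worth checking for them afterwards. Snowflake's VALIDATE table function returns the errors from the most recent COPY into the table:
SELECT * FROM TABLE(VALIDATE(stg_table, JOB_ID => '_last'));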
There's more information in Snowflake's guide to bulk loading from Amazon S3.
Azure Blob Storage to Snowflake
Start by creating an AZURE Storage Integration object that includes your Azure tenant ID.
CREATE STORAGE INTEGRATION my_azure_integration
  TYPE = EXTERNAL_STAGE
  STORAGE_PROVIDER = 'AZURE'
  ENABLED = TRUE
  AZURE_TENANT_ID = '<tenant_id>'
  STORAGE_ALLOWED_LOCATIONS = ('azure://myaccount.blob.core.windows.net/mycontainer/path1/');
DESCRIBE STORAGE INTEGRATION my_azure_integration;
As another part of the initial configuration, you must grant Snowflake access to the chosen storage locations: open the AZURE_CONSENT_URL returned by the DESCRIBE statement in a web browser and accept the permissions request. The DESCRIBE output also identifies your allocated Snowflake client application (AZURE_MULTI_TENANT_APP_NAME), which you must grant access to your allowed storage locations.
You may also need to grant network access to Snowflake by adding Snowflake's VNet subnet IDs to your storage account's network rules. Snowflake can load from any supported container type except ADLS Gen1.
Now that Azure Blob Storage access is secure, create an external stage object using the Storage Integration.
CREATE OR REPLACE STAGE my_azure_stage
  URL = 'azure://myaccount.blob.core.windows.net/mycontainer'
  STORAGE_INTEGRATION = my_azure_integration;
Lastly, issue a COPY command to transfer the data from blob storage to your waiting table:
COPY INTO stg_table
FROM @my_azure_stage/my-data-file.csv
FILE_FORMAT = (TYPE = 'CSV')
ON_ERROR = 'ABORT_STATEMENT';
There's more information in Snowflake's guide to bulk loading from Microsoft Azure.
Google Cloud Storage to Snowflake
Start by creating a GCS Storage Integration object that nominates the permitted buckets.
CREATE STORAGE INTEGRATION my_gcs_integration
  TYPE = EXTERNAL_STAGE
  STORAGE_PROVIDER = 'GCS'
  ENABLED = TRUE
  STORAGE_ALLOWED_LOCATIONS = ('gcs://<bucket>/<path>/');
DESCRIBE STORAGE INTEGRATION my_gcs_integration;
With Snowflake on GCP, a service account for cloud storage is created automatically for your Snowflake account. Its identity is returned in the STORAGE_GCP_SERVICE_ACCOUNT property of the DESCRIBE output.
In your Google Cloud Platform Console, grant this service account the necessary access to your storage buckets.
Now that Google Cloud Storage access is secure, create an external stage object using the Storage Integration.
CREATE OR REPLACE STAGE my_gcs_stage
  URL = 'gcs://my_bucket/'
  STORAGE_INTEGRATION = my_gcs_integration;
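Before loading, it's worth confirming that the integration and stage are working by listing the files Snowflake can see in the bucket (this works for any of the external stages above, not just GCS):
LIST @my_gcs_stage;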
Lastly, issue a COPY command to transfer the data from GCS to your waiting table:
COPY INTO stg_table
FROM @my_gcs_stage/my_data.csv
FILE_FORMAT = (TYPE = 'CSV' FIELD_DELIMITER = ',' SKIP_HEADER = 1);
There's more information in Snowflake's guide to bulk loading from Google Cloud Storage.
Excel to Snowflake
Snowflake cannot natively ingest Excel files, so some intermediate steps are required. Broadly, there are two main options:
- Convert your Excel files to Comma-Separated Values (CSV) format, using a "Save As" or an "Export". Then use one of the techniques described earlier.
- Use a data productivity platform such as Matillion, which can both read from Excel and write to Snowflake in a single operation.
Here's a screenshot of how an Excel to Snowflake load looks in the Matillion Data Productivity Cloud.
Note that the same choice of two options applies to a wide variety of connectors beyond Excel: from SaaS systems to REST APIs. The Matillion Data Productivity Cloud has many other connectors, all of which follow the same low-code pattern shown in the screenshot.
Further reading
- Snowflake's reference guide Load Data into Snowflake
- The SnowSQL CLI
- Role Based Access Control (RBAC) in the Cloud
- Matillion connectors
- Launching a free trial of Matillion
Ian Funnell
Data Alchemist
Ian Funnell, Data Alchemist at Matillion, curates The Data Geek weekly newsletter and manages the Matillion Exchange.
Follow Ian on LinkedIn: https://www.linkedin.com/in/ianfunnell