There are a number of ways to load data into Amazon Redshift, but when loading data into a table, the most efficient is the COPY command.
The COPY command is the most efficient method because it leverages Amazon Redshift's massively parallel processing (MPP) architecture, allowing it to read and load data in parallel from a number of sources at once.
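To take advantage of that parallelism, a common approach is to split the source data into multiple files that share a common key prefix in Amazon S3, so each node slice can read a file at the same time. Below is a minimal sketch of what such a load might look like; the table name, bucket, key prefix, and IAM role ARN are all placeholders, not values from this article.

```sql
-- A sketch of a parallel load: COPY reads every S3 object whose key
-- begins with the given prefix (e.g. part_000, part_001, ...),
-- distributing the files across node slices.
-- Table, bucket, prefix, and IAM role below are hypothetical.
COPY sales
FROM 's3://my-example-bucket/sales/part_'
IAM_ROLE 'arn:aws:iam::123456789012:role/MyRedshiftRole'
DELIMITER '|';
```

As a rule of thumb, splitting the data into a number of files that is a multiple of the number of slices in the cluster keeps all slices busy during the load.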
In this article we look at Amazon's own suggestions for using the COPY command. Then, drawing on our own real-world experience with Amazon Redshift, we give our own advice on the matter.
For more information on the work we do around Amazon Redshift, visit our Amazon Redshift Partner page.
For Amazon's own best-practice recommendations on using the COPY command to load data into Amazon Redshift, click here.
You can compress the files using gzip or lzop to save time uploading them. COPY then speeds up the load process by uncompressing the files as they are read.
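When loading compressed files, you tell COPY which compression format to expect. Here is a minimal sketch of a load from gzip-compressed files; as before, the table name, bucket, prefix, and IAM role are placeholders.

```sql
-- A sketch of loading gzip-compressed files: the GZIP keyword tells
-- COPY to decompress each object as it is read.
-- Table, bucket, prefix, and IAM role below are hypothetical.
COPY sales
FROM 's3://my-example-bucket/sales/part_'
IAM_ROLE 'arn:aws:iam::123456789012:role/MyRedshiftRole'
DELIMITER '|'
GZIP;
```

For lzop-compressed files, the GZIP keyword would be replaced with LZOP.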
Matillion ETL for Amazon Redshift
Simplify data management and unlock Redshift’s potential with Matillion ETL for Amazon Redshift – an ETL/ELT tool built specifically for Amazon Redshift.
Matillion ETL for Amazon Redshift pushes the data transformation down to Redshift, meaning you can process millions of rows in seconds, with real-time in-job feedback and linear scalability.
Delivered as an AMI on the AWS Marketplace, Matillion ETL for Amazon Redshift can be up and running in a matter of minutes.
Right now you can get a FREE 14-day trial of Matillion ETL for Amazon Redshift on the AWS Marketplace.
For more best-practice advice and information on optimising Amazon Redshift performance, download our free guide below.