Amazon Redshift vs. On-Premise Data Warehousing

  • Richard Thelwell
  • October 5, 2015

Amazon Redshift has revolutionised data warehousing, and its popularity has made it the fastest-growing service on the Amazon Web Services (AWS) marketplace. So why are so many companies turning to Redshift? In this article, we explore the benefits it offers and how they stack up against the more traditional approach of on-premise data warehousing.

Redshift is faster

One of the biggest benefits of Amazon Redshift, as opposed to traditional data warehousing solutions, is the speed at which it can process data.

This competitive advantage comes from the Massively Parallel Processing (MPP) capabilities of Redshift’s data warehouse architecture. When data is processed, the workload is distributed across the nodes in your Redshift cluster, so that every node works on its own share of the data at the same time. Through MPP, Amazon Redshift allows you to execute even the most complex queries faster.
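The scatter-and-gather idea behind MPP can be sketched in a few lines of Python. This is only a toy illustration, not how Redshift is implemented: the node count, the `customer-N` keys and the hash-based placement are all assumptions made for the example, and Python threads stand in for separate machines, so it shows the pattern rather than a real speed-up.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from zlib import crc32

NUM_NODES = 4  # hypothetical cluster size, chosen for the example


def distribute(rows, num_nodes=NUM_NODES):
    """Assign each (key, amount) row to a node by hashing its key."""
    slices = defaultdict(list)
    for key, amount in rows:
        node = crc32(key.encode("utf-8")) % num_nodes
        slices[node].append((key, amount))
    return slices


def partial_sum(node_rows):
    """Work done independently on one node: sum only its local rows."""
    return sum(amount for _, amount in node_rows)


def parallel_total(rows, num_nodes=NUM_NODES):
    """Scatter rows across nodes, aggregate locally, then combine."""
    slices = distribute(rows, num_nodes)
    with ThreadPoolExecutor(max_workers=num_nodes) as pool:
        partials = list(pool.map(partial_sum, slices.values()))
    # A final, cheap step merges the per-node partial results.
    return sum(partials)


# Illustrative data: 100 rows with amounts 1..100.
rows = [(f"customer-{i}", i) for i in range(1, 101)]
total = parallel_total(rows)
```

Each node only ever touches its own slice of the rows, which is why adding nodes lets the same query cover more data in roughly the same time.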

Amazon Redshift’s Massively Parallel Processing (MPP) makes data loading much faster than on-premise.

Furthermore, columnar storage of database tables improves query performance by drastically reducing overall disk I/O. Because each column is stored separately, Redshift can skip any columns that are not referenced in the query, reducing the amount of data that has to be read from disk. This cuts the time a query takes when compared with row stores, which must read entire rows even when only one field is needed.
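To make the difference concrete, here is a small Python sketch that counts how many values each layout has to read to total a single column. The table, its column names and the "values read" counter are invented for illustration; real engines work with disk blocks and compression rather than Python lists.

```python
# Row-oriented layout: each record keeps all of its fields together.
ROWS = [
    {"order_id": 1, "customer": "alice", "country": "UK", "amount": 20.0},
    {"order_id": 2, "customer": "bob", "country": "US", "amount": 35.5},
    {"order_id": 3, "customer": "carol", "country": "UK", "amount": 12.25},
]

# Column-oriented layout: one contiguous list per column.
COLUMNS = {name: [row[name] for row in ROWS] for name in ROWS[0]}


def sum_amount_row_store(rows):
    """Read every field of every row just to reach 'amount'."""
    values_read = 0
    total = 0.0
    for row in rows:
        values_read += len(row)  # the whole row comes off disk
        total += row["amount"]
    return total, values_read


def sum_amount_column_store(columns):
    """Read only the 'amount' column; other columns are never touched."""
    values = columns["amount"]
    return sum(values), len(values)


row_total, row_reads = sum_amount_row_store(ROWS)
col_total, col_reads = sum_amount_column_store(COLUMNS)
```

Both functions return the same total, but the row layout reads four values per row while the column layout reads one. The wider the table, the bigger that gap becomes.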

Redshift is low cost

So Redshift is much faster than on-premise data warehousing – great!

However, as with most comparisons, what it really comes down to, more often than not, is how much it is all going to cost.

With Amazon Redshift being based in the Cloud, you don’t have to worry about the cost of setting up your data warehouse. This means that, unlike on-premise solutions, there are no heavy upfront costs involved and no ongoing hardware or maintenance costs.

On-demand pricing makes Amazon Redshift a much lower-cost alternative to on-premise data warehousing.

On-demand pricing means that you only pay for the resources you provision. You are not tied down to lengthy contract commitments and can stop using the service whenever you like, without large sunk costs being incurred.

On-demand pricing starts as low as $0.25 an hour for a 160GB DC1.Large node, or $0.85 an hour for a larger 2TB node. And with reserved instance pricing offering even lower rates, it is estimated that Amazon Redshift can come in at about one-tenth of the cost of traditional data warehousing.
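As a rough worked example, the hourly rates quoted above can be turned into a monthly figure. The sketch below assumes a node running continuously for a 30-day month and ignores reserved pricing, data transfer and any other charges; the labels and the `monthly_cost` helper are made up for illustration.

```python
from decimal import Decimal

# Hourly on-demand rates as quoted in this article (USD).
HOURLY_RATES = {
    "160GB node": Decimal("0.25"),
    "2TB node": Decimal("0.85"),
}

HOURS_PER_MONTH = 24 * 30  # assumption: 30-day month, always on


def monthly_cost(hourly_rate, nodes=1, hours=HOURS_PER_MONTH):
    """On-demand cost of running `nodes` nodes for `hours` hours."""
    return hourly_rate * nodes * hours


small = monthly_cost(HOURLY_RATES["160GB node"])
large = monthly_cost(HOURLY_RATES["2TB node"])
print(f"160GB node: ${small} per month")
print(f"2TB node:   ${large} per month")
```

Under those assumptions the smaller node works out at $180 a month and the 2TB node at $612 a month, with no upfront hardware purchase in either case.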

Redshift is scalable

On-premise data warehousing solutions can often prove very restrictive when it comes to scaling your data warehouse up or down. This can be both a cost- and a resource-intensive procedure.

Amazon Redshift, on the other hand, is much more nimble. Designed to scale effortlessly with the growth of your business, Redshift allows you to respond instantly to changes in your performance or capacity requirements.

Amazon Redshift can easily scale with the growth of your business.

In just a few simple clicks, you can easily scale the number or type of nodes in your Cloud data warehouse through the AWS management console. You can start with a 160GB DC1.Large node and scale your way up all the way to a petabyte or more of compressed user data using 16TB DS2.8XLarge nodes.
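The same resize can also be scripted rather than clicked. The sketch below is an illustration under several assumptions: it uses the AWS SDK for Python (boto3) and its Redshift `modify_cluster` call, the cluster name `analytics-cluster` is hypothetical, one petabyte is taken as 1,024TB, and the node count is a naive division that ignores compression and other overheads.

```python
import math


def nodes_needed(capacity_tb, node_size_tb=16):
    """Naive estimate: how many nodes of a given size hold capacity_tb."""
    return math.ceil(capacity_tb / node_size_tb)


def build_resize_request(cluster_id, node_type, number_of_nodes):
    """Build the keyword arguments for a multi-node resize request."""
    if number_of_nodes < 2:
        raise ValueError("this sketch only covers multi-node clusters")
    return {
        "ClusterIdentifier": cluster_id,
        "ClusterType": "multi-node",
        "NodeType": node_type,
        "NumberOfNodes": number_of_nodes,
    }


def resize_cluster(request):
    """Send the resize request to AWS (needs boto3 and credentials)."""
    import boto3  # third-party; imported lazily so the helpers run without it

    client = boto3.client("redshift")
    return client.modify_cluster(**request)


# Hypothetical target: roughly one petabyte on 16TB nodes.
target_nodes = nodes_needed(1024)
request = build_resize_request("analytics-cluster", "ds2.8xlarge", target_nodes)
# resize_cluster(request)  # uncomment to run against a real cluster
```

Keeping the request-building step separate from the call to AWS means the sizing logic can be checked without touching a live cluster.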

This scalability is the reason so many start-ups worldwide choose to run their business on AWS, and that was exactly the case for app-based taxi booking company, Hailo. When discussing why Hailo chose Redshift over on-premise data warehousing, Platform Automation Lead Boyan Dimitrov explained that, ‘Scalability was the main driver really – it’s great for start-ups, and great for start-ups that expect to grow rapidly’.

Redshift is secure

So with Amazon Redshift offering a faster, cheaper, and more scalable alternative, why are companies still choosing on-premise data warehousing?

The answer, in many cases, is the underlying security concerns that some businesses still have when it comes to data management in the Cloud. However, there are a number of robust security measures in place to ensure your data is protected.

If you’re worried about security – there really isn’t anything to worry about.

Amazon Redshift uses industry-standard encryption techniques to keep your data secure, both in transit and at rest. Redshift supports SSL-enabled connections between client applications and your Redshift data warehouse in order to keep your data secure in transit. Furthermore, Amazon Redshift uses hardware-accelerated AES-256 to encrypt data at rest.
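From the client side, asking for an encrypted connection is usually a one-line setting. The following is a minimal sketch, assuming the third-party psycopg2 PostgreSQL driver and its `sslmode` option; the host name, database name and credentials are placeholders, and 5439 is used as Redshift's default port.

```python
def build_connection_params(host, dbname, user, password, port=5439):
    """Connection settings that refuse to fall back to an unencrypted link."""
    return {
        "host": host,
        "port": port,
        "dbname": dbname,
        "user": user,
        "password": password,
        "sslmode": "require",  # fail rather than connect without SSL
    }


def connect(params):
    """Open the connection (needs psycopg2 and a reachable cluster)."""
    import psycopg2  # third-party; imported lazily so the helper runs without it

    return psycopg2.connect(**params)


# Placeholder values for illustration only.
params = build_connection_params(
    host="example-cluster.example.com",
    dbname="analytics",
    user="report_user",
    password="change-me",
)
# conn = connect(params)  # uncomment to connect to a real cluster
```

With `sslmode` set to `require`, the driver refuses to connect if an encrypted session cannot be established, so traffic between the client and the warehouse is never sent in the clear.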
