What is Amazon Redshift, you ask?

What is Amazon Redshift

Amazon Redshift is a cloud-based Data Warehouse service by Amazon Web Services (AWS).

There are two terms to pay attention to here – Cloud and Data Warehouse.

Cloud, short for Cloud Computing, refers to computing resources provided by a third party. These resources range from processing power, storage and applications to broader service models such as SaaS, PaaS and IaaS.

Traditionally, most data warehouses are hosted on premises. Amazon Redshift, as a fully managed cloud service, handles all aspects of scaling, capacity provisioning, cluster backup, patching and upgrading. That makes a huge difference!

The benefits of Cloud Computing are immense; however, for the sake of simplicity let’s just say it saves you a lot of money and heartburn.  

 

Now let’s look at what a Data Warehouse is!

 

A Data Warehouse is a repository to store large amounts of historical data, intended for generating reports and performing analytics. If the data is in a structured format or semi-structured format, then you can store it in Redshift.

Quite often, Amazon Redshift is mistaken for a NoSQL database. It is not. As a matter of fact, Redshift is a relational database that uses a tailored version of PostgreSQL (an open-source relational database) for Online Analytical Processing (OLAP) and to support BI applications.

Now that we have a basic understanding of these terms, let’s read that line on Redshift again – Amazon Redshift is a cloud-based Data Warehouse service by AWS.

4 Noteworthy facts about Amazon Redshift

1. Not just an optimized relational database

At its core, Amazon Redshift is made of clusters. A cluster in turn is made up of one or more nodes. These nodes can be categorized into leader nodes and compute nodes.

The leader node does the job of coordination and communication (the engine), while the compute nodes do the heavy lifting (the database).
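
The division of labor between the leader node and the compute nodes can be pictured with a toy model. This is only a sketch in plain Python, not how Redshift is implemented: the class names, the round-robin distribution and the `amount` column are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ComputeNode:
    """Toy compute node: stores a share of the rows and aggregates them."""
    rows: list = field(default_factory=list)

    def partial_sum(self, column):
        # Each compute node works only on the rows it holds.
        return sum(row[column] for row in self.rows)


class LeaderNode:
    """Toy leader node: distributes rows and combines partial results."""

    def __init__(self, compute_nodes):
        self.compute_nodes = compute_nodes

    def load(self, rows):
        # Round-robin distribution, chosen here purely for simplicity.
        for i, row in enumerate(rows):
            self.compute_nodes[i % len(self.compute_nodes)].rows.append(row)

    def total(self, column):
        # Coordination step: ask every compute node, then merge the answers.
        return sum(node.partial_sum(column) for node in self.compute_nodes)


leader = LeaderNode([ComputeNode(), ComputeNode()])
leader.load([{"amount": 10}, {"amount": 20}, {"amount": 30}])
print(leader.total("amount"))  # 60
```

The client only ever talks to the leader in this sketch, which mirrors the idea that the leader coordinates while the compute nodes hold and process the data.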

2. But what about unstructured data?

You already know Amazon Redshift can handle semi-structured data in addition to the standard structured data, which is great! If you have a vast amount of unstructured data and want to generate analytics from it, Redshift has a solution for you.

Say hello to Amazon Redshift Spectrum!

Redshift Spectrum is a feature of Amazon Redshift that lets you query unstructured data stored in Amazon S3. You do not even have to load the data into the Redshift database. As a matter of fact, you can also use Redshift Spectrum to query your structured and semi-structured data straight from Amazon S3.
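
To make the "no loading required" idea concrete, here is a minimal sketch that renders the two SQL statements typically involved: an external schema and an external table pointing at files in S3. The helper functions are hypothetical, and so are the schema name, catalog database, IAM role ARN, bucket path and columns. The helpers do no escaping or validation, so treat them as illustration only.

```python
def external_schema_sql(schema, catalog_db, iam_role_arn):
    """Render a CREATE EXTERNAL SCHEMA statement (hypothetical helper)."""
    return (
        f"CREATE EXTERNAL SCHEMA {schema}\n"
        "FROM DATA CATALOG\n"
        f"DATABASE '{catalog_db}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        "CREATE EXTERNAL DATABASE IF NOT EXISTS;"
    )


def external_table_sql(schema, table, columns, s3_location):
    """Render a CREATE EXTERNAL TABLE statement for comma-delimited text files."""
    column_list = ",\n    ".join(f"{name} {sql_type}" for name, sql_type in columns)
    return (
        f"CREATE EXTERNAL TABLE {schema}.{table} (\n    {column_list}\n)\n"
        "ROW FORMAT DELIMITED\n"
        "FIELDS TERMINATED BY ','\n"
        "STORED AS TEXTFILE\n"
        f"LOCATION '{s3_location}';"
    )


# All names below are placeholders, not real resources.
schema_sql = external_schema_sql(
    "spectrum_demo", "demo_db", "arn:aws:iam::123456789012:role/DemoSpectrumRole"
)
table_sql = external_table_sql(
    "spectrum_demo",
    "sales",
    [("sale_id", "INTEGER"), ("amount", "DECIMAL(8,2)")],
    "s3://example-bucket/sales/",
)
print(schema_sql)
print(table_sql)
```

Once statements like these have been run against a cluster, the external table can be queried with ordinary SELECT statements while the data itself stays in S3.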

Related: Learn how to create tables in Redshift using examples

3. Does Redshift play nice with other tools?

Since Amazon Redshift is based on a relational database, it plays really well with other databases and with Data Integration, Reporting, Business Intelligence (BI), Analytics and SQL client tools.

Connection to the Redshift database can be established using ODBC or JDBC drivers.
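
As a small illustration, a JDBC connection is usually described by a URL made of the cluster endpoint, the port and the database name. The sketch below builds such a URL. The helper name, the endpoint and the database `dev` are placeholders, and 5439 is assumed as the port because it is the Redshift default.

```python
def redshift_jdbc_url(host, database, port=5439):
    """Build a Redshift JDBC URL (hypothetical helper for illustration)."""
    if not host or not database:
        raise ValueError("host and database are both required")
    return f"jdbc:redshift://{host}:{port}/{database}"


# Placeholder endpoint; substitute the endpoint of your own cluster.
url = redshift_jdbc_url(
    "examplecluster.abc123xyz789.us-west-2.redshift.amazonaws.com", "dev"
)
print(url)
```

A SQL client or BI tool that ships with the Redshift JDBC driver would take a URL of this shape along with a user name and password.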

4. So how does Amazon Redshift pricing work?

Pricing with any AWS Service is based on a Pay-as-you-go model. Similar to your water or electricity bill, you only pay for services used for the duration of the usage, without the need to sign any long-term contracts.

AWS offers a lot of flexibility when it comes to Amazon Redshift pricing. The best way to make the most of it is to think in terms of environments: Sandbox/Prototyping, Development, Testing, Staging and Production.

  • Sandbox/Prototyping environment: If you are playing around with the idea of Redshift, want to understand its features and functionality, or want to build a quick prototype, consider the AWS Free Tier trial of Amazon Redshift. With this option you get up to 750 hours of free usage per month, for two months.

  • Development/Test/Staging environment(s): These environments do not need to be up and running 24/7. Your best option is On-Demand (Pay-as-you-go) pricing. With this option, you pay by the hour and can shut down instances when you are not using them or no longer need them, so you don’t get billed.

     

    If you opt for On-Demand instances, think of Amazon Redshift pricing in terms of Compute, Storage and Data Transfer, as shown below.

    Compute: Dense Compute (DC2), Dense Storage (DS2), RA3 with Redshift Managed Storage
    Storage: Redshift Managed, Additional Backup
    Data Transfer: Redshift Spectrum

  • Production environment(s): You want these environments to be up and running with very little downtime. Reserved Instances are the best fit here. AWS lets you reserve instances for a 1-year or 3-year term, and they often end up cheaper than the Pay-as-you-go option.

    An important point to remember: with AWS Reserved Instances, you are charged for the instances for the full term you signed up for, whether or not you use them. The best part is that the price includes two additional copies of your data, and AWS takes care of availability, backup, durability, monitoring, security and maintenance.

    For additional details on Redshift price for reserved nodes, click here.
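
The trade-off between On-Demand and Reserved pricing described above comes down to simple arithmetic: On-Demand bills only the hours used, while Reserved bills every hour of the term. The sketch below works through it. The hourly rates are made-up numbers for illustration only, not actual AWS prices, and the function names are hypothetical.

```python
HOURS_PER_MONTH = 730  # average hours in a month (8,760 / 12)


def on_demand_cost(hourly_rate, hours_used):
    """On-Demand: pay only for the hours the cluster is running."""
    return hourly_rate * hours_used


def reserved_cost(effective_hourly_rate, hours_in_period=HOURS_PER_MONTH):
    """Reserved: billed for every hour of the period, used or not."""
    return effective_hourly_rate * hours_in_period


def break_even_hours(on_demand_rate, reserved_rate, hours_in_period=HOURS_PER_MONTH):
    """Hours of use per period above which Reserved becomes cheaper."""
    return reserved_rate * hours_in_period / on_demand_rate


# Hypothetical rates in USD per node-hour; NOT real AWS prices.
ON_DEMAND_RATE = 0.25
RESERVED_RATE = 0.15

dev_hours = 8 * 22  # a dev cluster running 8 hours a day, 22 days a month
print(f"Dev, On-Demand:  ${on_demand_cost(ON_DEMAND_RATE, dev_hours):.2f}")
print(f"Dev, Reserved:   ${reserved_cost(RESERVED_RATE):.2f}")
print(f"Prod, On-Demand: ${on_demand_cost(ON_DEMAND_RATE, HOURS_PER_MONTH):.2f}")
print(f"Prod, Reserved:  ${reserved_cost(RESERVED_RATE):.2f}")
print(f"Break-even:      {break_even_hours(ON_DEMAND_RATE, RESERVED_RATE):.0f} hours/month")
```

With these assumed rates, a part-time development cluster is cheaper On-Demand, while an always-on production cluster is cheaper Reserved, which matches the environment-based approach above.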

By now you should have a high-level understanding of how to approach Amazon Redshift pricing. Delving into the finer price details is not worth the time, because they can change. The last thing you want is to be stuck with an outdated assumption about pricing and its components. Instead, use the AWS Pricing Calculator for Amazon Redshift to get the most up-to-date pricing details.

Redshift helpful links

Amazon Redshift Documentation

This is the latest version of Redshift Documentation

Get started with Amazon Redshift Spectrum

Learn how to create external tables, schema and query data using Spectrum

Interested in our services?

Email us at: info@obstkel.com

Copyright 2021 © OBSTKEL LLC. All rights Reserved