Amazon Redshift is a cloud-based data warehouse service by Amazon Web Services (AWS).
There are two terms to pay attention to here: Cloud and Data Warehouse.
Cloud, short for Cloud Computing, refers to computing resources provided by a third party. These resources range from processing power, storage and applications to full service models such as SaaS, PaaS and IaaS.
Traditionally, most data warehouses are hosted on premises; Amazon Redshift, as a fully managed cloud service, instead handles all aspects of scaling, capacity provisioning, cluster backup, patching and upgrading. That makes a huge difference!
The benefits of Cloud Computing are immense; however, for the sake of simplicity let's just say it saves you a lot of money and heartburn.
Now let's look at what a Data Warehouse is!
A Data Warehouse is a repository to store large amounts of historical data, intended for generating reports and performing analytics. If the data is in a structured format or semi-structured format, then you can store it in Redshift.
Quite often, Amazon Redshift is mistaken for a NoSQL database. Redshift is not a NoSQL database. As a matter of fact, Redshift is a relational database built on a tailored version of PostgreSQL (an open-source relational database), designed for Online Analytical Processing (OLAP) and for supporting BI applications.
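To make "relational" and "OLAP" a little more concrete, here is a minimal sketch of the kind of aggregate SQL query a warehouse is built to answer. It runs against an in-memory SQLite database purely as a stand-in, so no Redshift cluster is involved; the `sales` table and its rows are made up for illustration.

```python
import sqlite3

# In-memory SQLite stands in for a warehouse here; Redshift is not involved.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, year INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("east", 2022, 100.0),
        ("east", 2023, 150.0),
        ("west", 2022, 80.0),
        ("west", 2023, 120.0),
    ],
)

# A typical analytical (OLAP-style) question: total sales per region.
rows = conn.execute(
    "SELECT region, SUM(amount) AS total FROM sales "
    "GROUP BY region ORDER BY region"
).fetchall()
conn.close()

for region, total in rows:
    print(region, total)
```

The same style of SELECT with GROUP BY is what you would send to Redshift through a SQL client, only over far larger tables.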
Now that we have a basic understanding of these terms, let's read that line on Redshift again: "Amazon Redshift is a cloud-based data warehouse service by AWS".
At its core, Amazon Redshift is made of clusters. A cluster in turn is made up of one or more nodes, which fall into two categories: a leader node and compute nodes.
The leader node handles coordination and communication (the engine), while the compute nodes do the heavy lifting (the database).
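The division of labor between the two node types can be sketched as a toy model. This is not how Redshift is implemented internally; the class names, the round-robin distribution and the sum query are all assumptions chosen only to show the idea of a coordinator that hands work to workers and combines their partial results.

```python
from dataclasses import dataclass, field


@dataclass
class ComputeNode:
    """Holds one portion of the data and does the heavy lifting on it."""
    rows: list = field(default_factory=list)

    def partial_sum(self) -> float:
        return sum(self.rows)


@dataclass
class LeaderNode:
    """Coordinates the compute nodes and combines their partial results."""
    compute_nodes: list

    def load(self, values) -> None:
        # Round-robin distribution, an illustrative choice for this sketch.
        for i, value in enumerate(values):
            self.compute_nodes[i % len(self.compute_nodes)].rows.append(value)

    def total(self) -> float:
        # Each compute node works on its own rows; the leader only aggregates.
        return sum(node.partial_sum() for node in self.compute_nodes)


leader = LeaderNode([ComputeNode(), ComputeNode(), ComputeNode()])
leader.load([10, 20, 30, 40, 50])
cluster_total = leader.total()
print(cluster_total)
```

The client only ever talks to the leader; adding compute nodes spreads the rows, and therefore the work, across more machines.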
You already know Amazon Redshift can handle semi-structured data in addition to the standard structured data, which is great! If you have a vast amount of data sitting in files outside the warehouse and want to generate analytics from it, Redshift has a solution for you.
Say hello to Amazon Redshift Spectrum!
Redshift Spectrum is a feature of Amazon Redshift that lets you query data stored as files in Amazon S3, in structured and semi-structured formats such as CSV, JSON, Parquet and ORC. You do not even have to load the data into the Redshift database; the files are queried straight from Amazon S3.
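Querying S3 data in place usually comes down to two SQL statements: one that registers an external schema and one that describes the files as an external table. The helper below is a minimal sketch that only builds those statements as strings. The schema, table, bucket and IAM role names are hypothetical placeholders, and the files are assumed to be in Parquet format.

```python
def spectrum_statements(schema, catalog_db, iam_role_arn, table, columns, s3_location):
    """Return SQL statements that expose S3 files as an external table."""
    column_sql = ",\n  ".join(f"{name} {sql_type}" for name, sql_type in columns)
    create_schema = (
        f"CREATE EXTERNAL SCHEMA {schema}\n"
        f"FROM DATA CATALOG\n"
        f"DATABASE '{catalog_db}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        f"CREATE EXTERNAL DATABASE IF NOT EXISTS;"
    )
    create_table = (
        f"CREATE EXTERNAL TABLE {schema}.{table} (\n  {column_sql}\n)\n"
        f"STORED AS PARQUET\n"
        f"LOCATION '{s3_location}';"
    )
    # Once the table exists, it is queried like any other table.
    query = f"SELECT COUNT(*) FROM {schema}.{table};"
    return [create_schema, create_table, query]


# All names below are placeholders for illustration only.
statements = spectrum_statements(
    schema="spectrum_demo",
    catalog_db="demo_catalog",
    iam_role_arn="arn:aws:iam::123456789012:role/ExampleSpectrumRole",
    table="page_views",
    columns=[("user_id", "BIGINT"), ("url", "VARCHAR(2048)"), ("viewed_at", "TIMESTAMP")],
    s3_location="s3://example-bucket/page_views/",
)
for statement in statements:
    print(statement, end="\n\n")
```

The statements would be run through any SQL client connected to the cluster; the data itself never leaves S3.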
Since Amazon Redshift is based on a relational database, it plays really well with other databases and with Data Integration, Reporting, Business Intelligence (BI), Analytics and SQL Client tools.
Connection to the Redshift database can be established using ODBC or JDBC drivers.
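For JDBC, the connection is described by a URL made of the cluster endpoint, the port and the database name. Here is a small sketch that assembles one; the endpoint and database name are placeholders, and 5439 is assumed as the port because it is the Redshift default.

```python
def redshift_jdbc_url(host: str, database: str, port: int = 5439) -> str:
    """Build a JDBC URL of the form jdbc:redshift://host:port/database."""
    return f"jdbc:redshift://{host}:{port}/{database}"


# Placeholder endpoint and database name, for illustration only.
url = redshift_jdbc_url(
    host="examplecluster.abc123xyz789.us-west-2.redshift.amazonaws.com",
    database="dev",
)
print(url)
```

Most SQL client and BI tools ask for exactly this URL, plus a user name and password, when you add Redshift as a data source.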
Pricing for AWS services is based on a pay-as-you-go model. Similar to your water or electricity bill, you only pay for the services you use, for the duration of the usage, without the need to sign any long-term contracts.
AWS offers a lot of flexibility when it comes to Amazon Redshift price. The best approach to maximize these benefits is to think in terms of environments: Sandbox/Prototyping, Development, Testing, Staging and Production.
If you opt for On-Demand pricing, think of the Amazon Redshift price in terms of compute, storage and data transfer. The compute price depends on the node type you choose:
Dense Compute (DC2)
Dense Storage (DS2)
RA3 with Redshift Managed Storage
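The three components add up in a straightforward way, which the sketch below spells out. Every rate in it is a made-up number chosen to keep the arithmetic easy to follow, not an actual AWS price, and the function name and parameters are hypothetical.

```python
def on_demand_monthly_cost(nodes, hourly_rate, hours,
                           storage_gb=0.0, storage_rate_per_gb=0.0,
                           transfer_gb=0.0, transfer_rate_per_gb=0.0):
    """Split an on-demand bill into compute, storage and data transfer."""
    compute = round(nodes * hourly_rate * hours, 2)
    storage = round(storage_gb * storage_rate_per_gb, 2)
    transfer = round(transfer_gb * transfer_rate_per_gb, 2)
    return {
        "compute": compute,
        "storage": storage,
        "transfer": transfer,
        "total": round(compute + storage + transfer, 2),
    }


# Hypothetical rates, NOT real AWS prices.
estimate = on_demand_monthly_cost(
    nodes=2, hourly_rate=0.25, hours=720,
    storage_gb=1000, storage_rate_per_gb=0.02,
    transfer_gb=50, transfer_rate_per_gb=0.02,
)
print(estimate)
```

Running a sandbox cluster only during working hours, for example, shrinks the `hours` input and with it the compute part of the bill.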
An important point to remember: with AWS Reserved Instances, you are charged for the instances for the term you signed up for, whether or not you use them. The best part: the price includes two additional copies of your data, and AWS takes care of availability, backup, durability, monitoring, security and maintenance.
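Because a reserved term is paid for whether it is used or not, it helps to know how many hours of use it takes before the commitment beats on-demand pricing. The following is a rough sketch of that break-even calculation; the prices are invented for illustration and the function names are hypothetical.

```python
def break_even_hours(reserved_term_cost, on_demand_hourly_rate):
    """Hours of use at which on-demand spending equals the reserved commitment."""
    return reserved_term_cost / on_demand_hourly_rate


def cheaper_option(expected_hours, reserved_term_cost, on_demand_hourly_rate):
    """Compare the fixed reserved cost with the usage-based on-demand cost."""
    on_demand_cost = expected_hours * on_demand_hourly_rate
    return "reserved" if reserved_term_cost < on_demand_cost else "on-demand"


# Hypothetical prices per node for a one-year term, NOT real AWS prices.
RESERVED_TERM_COST = 1500.0
ON_DEMAND_RATE = 0.25

threshold = break_even_hours(RESERVED_TERM_COST, ON_DEMAND_RATE)
always_on = cheaper_option(8760, RESERVED_TERM_COST, ON_DEMAND_RATE)   # runs all year
occasional = cheaper_option(2000, RESERVED_TERM_COST, ON_DEMAND_RATE)  # runs part-time
print(threshold, always_on, occasional)
```

Under these assumed numbers, an always-on production cluster favors a reserved term, while a sandbox that runs occasionally is better left on demand.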
For additional details on Redshift price for reserved nodes, see the Amazon Redshift pricing page.
By now you should have a high-level understanding of how to approach Amazon Redshift pricing. Delving into the nuances of the price details is not worth the time because they could change. The last thing you want is to be stuck with an outdated assumption about pricing and its components. Instead, use the AWS Pricing Calculator for Amazon Redshift to determine the most up-to-date details on pricing.