Provisioning a Cost-Efficient Redshift Cluster for Big Data | AWS Certified Big Data - Specialty Exam

Question

A company needs to start using a Redshift cluster.

They have 10 GB of data.

They don't know the cluster size that should be used for the cluster.

They have not done any testing to come to the recommended cluster size.

Which of the following would be the ideal way to provision the Redshift cluster in a cost-efficient manner?

Answers

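A. Create the cluster with a large number of cluster nodes

B. Create the cluster with a small number of cluster nodes

C. Insist on doing the testing to come up with the base requirement for the number of clusters

D. Plan on purchasing Reserved capacity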

Answer - B.

To be cost efficient you can start with a minimum number of nodes for the cluster.
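As a rough illustration of starting with the minimum, the sketch below builds the arguments for a single-node cluster using boto3's Redshift `create_cluster` call. The cluster identifier, the `dc2.large` node type, and the credentials in the usage note are placeholder assumptions, not values from the question, and the helper names are hypothetical.

```python
def smallest_cluster_params(cluster_id, node_type, admin_user, admin_password):
    """Build keyword arguments for a single-node Redshift cluster.

    All argument values are supplied by the caller; nothing here is
    prescribed by the exam question.
    """
    return {
        "ClusterIdentifier": cluster_id,
        # A single-node cluster is the smallest footprint, so no
        # NumberOfNodes entry is included.
        "ClusterType": "single-node",
        "NodeType": node_type,
        "MasterUsername": admin_user,
        "MasterUserPassword": admin_password,
    }


def create_smallest_cluster(params):
    """Send the request to AWS (needs boto3 and valid credentials)."""
    # Imported lazily so the parameter builder works without boto3 installed.
    import boto3

    return boto3.client("redshift").create_cluster(**params)
```

For example, `smallest_cluster_params("demo-cluster", "dc2.large", "admin", "<password>")` returns a dictionary that can be reviewed before it is passed to `create_smallest_cluster`.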

The AWS Documentation mentions the following.

If your storage and performance needs change after you initially provision your cluster, you can resize your cluster.

You can scale the cluster in or out by adding or removing nodes.

Additionally, you can scale the cluster up or down by specifying a different node type.

For example, you can add more nodes, change node types, change a single-node cluster to a multinode cluster, or change a multinode cluster to a single-node cluster.

However, you must ensure that the resulting cluster is large enough to hold the data that you currently have or else the resize will fail.
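The size constraint above can be checked with simple arithmetic before a resize is requested. This is a minimal sketch, assuming the caller knows the usable storage per node for the target node type; the 160 GB figure in the usage note is an illustrative assumption, and the function names are hypothetical.

```python
import math


def min_nodes_for_data(data_gb, storage_per_node_gb):
    """Smallest node count whose combined storage can hold the data."""
    if storage_per_node_gb <= 0:
        raise ValueError("storage_per_node_gb must be positive")
    # A cluster always has at least one node, even for tiny datasets.
    return max(1, math.ceil(data_gb / storage_per_node_gb))


def resize_would_fit(data_gb, target_nodes, storage_per_node_gb):
    """True if the target cluster is large enough for the current data."""
    return target_nodes >= min_nodes_for_data(data_gb, storage_per_node_gb)
```

With an assumed 160 GB per node, the company's 10 GB fits on one node, while a 400 GB dataset would not fit on two nodes and the resize would be expected to fail. Real usable capacity can be lower than the raw figure, so treat the result as a lower bound.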

Option A is incorrect since this would not be a cost-effective option.

Option C is incorrect because, although testing is possible, it is sometimes difficult to come up with a benchmark; that is why the service lets you keep a flexible infrastructure setup.

Option D is incorrect because we don't know the requirement, so we cannot plan to buy reserved instances at the start.

For more information on working with clusters, please refer to the below URL.

https://docs.aws.amazon.com/redshift/latest/mgmt/working-with-clusters.html

The ideal way to provision a Redshift cluster in a cost-efficient manner would be to start with a small number of cluster nodes.

Option A - Create the cluster with a large number of cluster nodes: This option may result in over-provisioning, which can increase costs unnecessarily. The company does not know the cluster size that should be used, so starting with a large number of cluster nodes would not be a cost-efficient approach.

Option B - Create the cluster with a small number of cluster nodes: This is the best option because it allows the company to start with a small cluster size and scale up as needed. Starting with a small number of nodes can save costs while still meeting the company's needs. This approach is more cost-efficient than starting with a large number of cluster nodes and potentially over-provisioning.

Option C - Insist on doing the testing to come up with the base requirement for the number of clusters: While testing is always a good practice, insisting on it may delay the deployment of the cluster. Additionally, testing may not be necessary for small datasets, such as the 10 GB of data that the company has. Starting with a small cluster size and scaling up as needed would be a more practical approach.

Option D - Plan on purchasing Reserved capacity: Reserved capacity is a good option for long-term usage. However, purchasing reserved capacity without knowing the actual cluster size needed can result in over-provisioning or under-provisioning, which can be costly. It is better to start with a small number of cluster nodes and scale up as needed before committing to purchasing reserved capacity.
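The over-provisioning risk with reserved capacity can be illustrated with a small cost comparison. The rates below are unit placeholders chosen for easy arithmetic, not AWS prices, and the function names are hypothetical.

```python
def on_demand_cost(nodes_used, hourly_rate, hours):
    """Cost of paying only for the nodes actually running."""
    return nodes_used * hourly_rate * hours


def reserved_cost(nodes_reserved, effective_hourly_rate, hours):
    """Cost of a reservation, paid for every reserved node whether used or not."""
    return nodes_reserved * effective_hourly_rate * hours


# Placeholder scenario: 4 nodes reserved at half the on-demand rate,
# but the workload turns out to need only 1 node.
HOURS = 100
actual = on_demand_cost(nodes_used=1, hourly_rate=1.0, hours=HOURS)
committed = reserved_cost(nodes_reserved=4, effective_hourly_rate=0.5, hours=HOURS)
print(f"on-demand for 1 node: {actual}, reserved for 4 nodes: {committed}")
```

Under these assumed numbers the reservation costs twice as much as running one on-demand node, despite the lower per-node rate, which is why sizing the cluster first matters.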

In summary, the best approach for the company to provision the Redshift cluster in a cost-efficient manner would be to start with a small number of cluster nodes and scale up as needed.