Performing ETL Transformations on Data for Google Cloud | Answer for ACE: Associate Cloud Engineer Exam

ETL Transformations on Unstructured Data for Google Cloud Dataflow

Question

Your company has a large quantity of unstructured data in different file formats.

You want to perform ETL transformations on the data.

You need to make the data accessible on Google Cloud so it can be processed by a Dataflow job.

What should you do?

Answers

Explanations


A. B. C. D.

Correct answer: B. Upload the data to Cloud Storage using the gsutil command line tool.

https://cloud.google.com/solutions/performing-etl-from-relational-database-into-bigquery

To make the data accessible on Google Cloud so that it can be processed by a Dataflow job, you should upload the data to Cloud Storage using the gsutil command line tool. Here is why:

A. Uploading the data to BigQuery using the bq command line tool may be a good option if the data is structured and you want to run SQL queries on it. However, BigQuery is a data warehouse, not a store for raw files in arbitrary formats, so it is not the appropriate storage option for unstructured data that still needs ETL transformations.

B. Cloud Storage is the correct storage option for unstructured data, and gsutil is a command-line tool for uploading data to Cloud Storage. Once the data is uploaded, you can use Cloud Dataflow to perform ETL transformations on the data.
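The upload step in option B can be sketched as a small Python wrapper around gsutil. This is a minimal illustration, not a prescribed procedure: the helper names, the local directory `./raw-data`, and the bucket `example-etl-landing` are hypothetical, and the script only prints the command unless `dry_run` is set to `False`.

```python
import shlex
import subprocess
from typing import List


def build_gsutil_upload(local_dir: str, bucket: str, prefix: str = "") -> List[str]:
    """Return the argv for a parallel, recursive gsutil copy into Cloud Storage."""
    # Build gs://<bucket>/<prefix>, dropping the trailing slash when prefix is empty.
    destination = f"gs://{bucket}/{prefix.strip('/')}".rstrip("/")
    # -m runs the copy with parallel workers; cp -r recurses into subdirectories.
    return ["gsutil", "-m", "cp", "-r", local_dir, destination]


def upload(local_dir: str, bucket: str, prefix: str = "", dry_run: bool = True) -> str:
    """Return the command as a string; execute it only when dry_run is False."""
    argv = build_gsutil_upload(local_dir, bucket, prefix)
    command = shlex.join(argv)
    if not dry_run:
        # Assumes the Google Cloud SDK is installed and authenticated.
        subprocess.run(argv, check=True)
    return command


if __name__ == "__main__":
    # Hypothetical local directory and bucket name, for illustration only.
    print(upload("./raw-data", "example-etl-landing", "incoming"))
```

Once the files are in the bucket, a Dataflow job can be pointed at the corresponding `gs://` location as its input.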

C. Uploading the data into Cloud SQL using the import function in the console is not the best option for unstructured data. Cloud SQL is a fully managed relational database service for structured data.

D. Cloud Spanner is a horizontally scalable, globally distributed relational database service designed for mission-critical OLTP applications. It is not the best option for storing unstructured data as input for ETL transformations.

In summary, uploading the unstructured data to Cloud Storage using the gsutil command line tool is the best option for making the data accessible on Google Cloud so it can be processed by a Dataflow job.