AWS Big Data Querying Tool for Interactive Analysis

Interactive Querying Tool for Analyzing CSV, JSON, and Columnar Data Formats | AWS Certified Big Data - Specialty Exam

Question

FlexiToner uses AWS to query 10 years' worth of historical data and get results in moments, with the flexibility to explore data for deeper insights.

Movable Ink provides real-time personalization of marketing emails based on a wide range of user, device, and contextual data, driving higher response rates and better customer experiences. The company wants its data scientists to use an interactive query service to access the data (structured, semi-structured, and unstructured) loaded in S3, and to recommend and provide insights that improve the results of customer marketing campaigns.

The database grows by up to 100 GB per day.

To reduce time to insight, optimize costs, and increase flexibility for its analysis, which tool can provide interactive querying capability out of the box for datasets stored in S3 in CSV, JSON, or columnar data formats such as Apache Parquet and Apache ORC? Select one option.

Answers

Explanations

A. Amazon Athena
B. Amazon QuickSight
C. AWS Glue
D. Amazon Machine Learning (Amazon ML)

Answer: A.

Option A is correct - Athena helps you analyze unstructured, semi-structured, and structured data stored in Amazon S3.

Examples include CSV, JSON, or columnar data formats such as Apache Parquet and Apache ORC.

You can use Athena to run ad-hoc queries using ANSI SQL, without the need to aggregate or load the data into Athena.

https://docs.aws.amazon.com/athena/latest/ug/when-should-i-use-ate.html
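As an illustration of the ad-hoc querying described above, here is a minimal sketch of running a SQL statement against data in S3 through the Athena API. It assumes a boto3 Athena client is passed in; the function name run_athena_query, the database name, the table name, and the bucket shown in the usage comment are hypothetical and not part of the original question. Only the first page of results is read, so treat it as a starting point rather than a complete client.

```python
import time


def run_athena_query(client, sql, database, output_location, poll_seconds=1.0):
    """Start an Athena query, wait for it to finish, return the first page of rows.

    `client` is expected to behave like boto3.client("athena").
    """
    started = client.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    query_id = started["QueryExecutionId"]

    # Athena runs queries asynchronously, so poll until a terminal state.
    while True:
        execution = client.get_query_execution(QueryExecutionId=query_id)
        state = execution["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(poll_seconds)

    if state != "SUCCEEDED":
        raise RuntimeError(f"Athena query {query_id} ended in state {state}")

    # For a SELECT, the first returned row holds the column names.
    # Cells without a value (NULL) come back without "VarCharValue".
    results = client.get_query_results(QueryExecutionId=query_id)
    return [
        [cell.get("VarCharValue") for cell in row["Data"]]
        for row in results["ResultSet"]["Rows"]
    ]


# Hypothetical usage (needs boto3 and AWS credentials; all names are placeholders):
#   import boto3
#   rows = run_athena_query(
#       boto3.client("athena"),
#       "SELECT campaign, COUNT(*) AS opens FROM email_events GROUP BY campaign",
#       database="marketing",
#       output_location="s3://example-athena-results/",
#   )
```

Passing the client in as a parameter keeps the polling logic easy to exercise with a stub, without touching AWS.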

Option B is incorrect - Amazon QuickSight is a business analytics service you can use to build visualizations, perform ad hoc analysis, and get business insights from your data.

It can automatically discover AWS data sources and also works with your data sources.

Amazon QuickSight enables organizations to scale to hundreds of thousands of users, and delivers responsive performance by using a robust in-memory engine (SPICE).

https://docs.aws.amazon.com/quicksight/latest/user/welcome.html

Option C is incorrect - AWS Glue is a fully managed extract, transform, and load (ETL) service that makes it easy for customers to prepare and load their data for analytics.

Point AWS Glue to your data stored on AWS, and AWS Glue discovers your data and stores the associated metadata (e.g., table definition and schema) in the AWS Glue Data Catalog.

Once cataloged, your data is immediately searchable, queryable, and available for ETL.

Option D is incorrect - Amazon Machine Learning (Amazon ML) is a robust, cloud-based service that makes it easy for developers of all skill levels to use machine learning technology.

Amazon ML provides visualization tools and wizards that guide you through the process of creating machine learning (ML) models without having to learn complex ML algorithms and technology.

https://docs.aws.amazon.com/machine-learning/latest/dg/what-is-amazon-machine-learning.html

The most appropriate tool to provide interactive querying capability out of the box for datasets stored in S3 in CSV, JSON, or columnar data formats such as Apache Parquet and Apache ORC is Amazon Athena (Option A).

Amazon Athena is a serverless interactive query service that makes it easy to analyze unstructured, semi-structured, and structured data stored in Amazon S3 using standard SQL. With Athena, you can run ad-hoc queries on data without having to aggregate or load the data into Athena. Athena also supports various data formats, including CSV, JSON, and columnar data formats such as Apache Parquet and Apache ORC. Because it is serverless, it suits ad-hoc analysis of large datasets in S3 without any infrastructure to provision or manage.
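To make the format support concrete, the sketch below builds a CREATE EXTERNAL TABLE statement for each of the formats named above. It is only an illustration: the helper name build_create_table, the table and column names, and the bucket path are hypothetical. The CSV and JSON clauses shown are one common choice each (a comma-delimited row format and the OpenX JSON SerDe); other SerDes can be used instead. The helper does no escaping, so it should only be fed trusted values.

```python
# Storage clause per data format. Parquet and ORC only need STORED AS;
# the CSV and JSON entries are one common option among several.
FORMAT_CLAUSES = {
    "csv": "ROW FORMAT DELIMITED FIELDS TERMINATED BY ','",
    "json": "ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'",
    "parquet": "STORED AS PARQUET",
    "orc": "STORED AS ORC",
}


def build_create_table(table, columns, data_format, s3_location):
    """Return Athena DDL that maps a table onto files already sitting in S3.

    `columns` is a list of (name, type) pairs, e.g. [("user_id", "string")].
    """
    try:
        clause = FORMAT_CLAUSES[data_format.lower()]
    except KeyError:
        raise ValueError(f"unsupported format: {data_format}") from None

    # DDL identifiers are quoted with backticks.
    column_sql = ",\n".join(f"  `{name}` {col_type}" for name, col_type in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {table} (\n"
        f"{column_sql}\n"
        f")\n"
        f"{clause}\n"
        f"LOCATION '{s3_location}'"
    )


# Hypothetical example: a Parquet dataset of email events.
print(build_create_table(
    "email_events",
    [("user_id", "string"), ("opened_at", "timestamp")],
    "parquet",
    "s3://example-bucket/events/",
))
```

The table is only metadata over the existing files; no data is copied or loaded, which is why the same query workflow applies regardless of the underlying format.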

QuickSight (Option B) is a business intelligence (BI) service that provides visualization and reporting capabilities. While it can access data stored in Amazon S3, it is not primarily designed for interactive querying of large datasets.

AWS Glue (Option C) is an extract, transform, and load (ETL) service that can help prepare and transform data for analysis. While it can discover and catalog unstructured, semi-structured, and structured data stored in Amazon S3, it is not designed for interactive querying of large datasets.

Amazon ML (Option D) is a machine learning service that can be used to build predictive models using data stored in Amazon S3. While it can read data stored in Amazon S3 to train models, it is not designed for interactive querying of large datasets.

Therefore, the correct option is A, Amazon Athena.