Azure Data Lake Store Gen1 Schema Plugin

Know External Data Schema

Question

You are working on Azure Data Lake Store Gen1.

Suddenly, you realize the need to know the schema of the external data.

Which of the following plug-ins would you use to find out the external data schema?

Answers

A. ipv4_lookup
B. mysql_request
C. pivot
D. narrow
E. infer_storage_schema

Correct Answer: E.

Explanations

Option A is incorrect.

The ipv4_lookup plug-in looks up an IPv4 value in a lookup table and returns the rows with matching values.

Option B is incorrect.

The mysql_request plug-in sends a SQL query to a MySQL Server network endpoint and returns the first rowset in the results.

Option C is incorrect.

The pivot plug-in rotates a table by turning the unique values from one column of the input table into multiple columns in the output table. It performs aggregations, where required, on the remaining column values that appear in the final output.

Option D is incorrect.

The narrow plug-in is used to unpivot a wide table into a table with only three columns.

Option E is correct.

The infer_storage_schema plug-in infers the schema of external data and returns it as a CSL schema string.

The correct answer to the question is E. infer_storage_schema.

Azure Data Lake Store Gen1 is a cloud-based storage service for big data analytics workloads. It stores large amounts of data in a hierarchical, HDFS-compatible file system and integrates with other Azure services for big data processing and analysis.

External data, in this context, is data that a query engine reads in place from storage instead of ingesting it first. The plug-ins listed in the question belong to the Kusto Query Language used by Azure Data Explorer, which can read files held in stores such as Azure Blob Storage and Azure Data Lake Store.

When working with external data, it is essential to know the schema, or structure, of the data so that it can be processed correctly. The Kusto Query Language provides a plug-in called "infer_storage_schema" that infers the schema of external data, including files stored in Azure Data Lake Store Gen1.

The "infer_storage_schema" plug-in reads the files in the external location and derives the column names and data types from their content, returning the result as a single CSL schema string. This saves the time and effort that would otherwise be needed to inspect the external data manually.
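
As an illustration, a call to the plug-in might look like the following Kusto query. This is a minimal sketch, not taken from the question: the account name, folder path, and file format are hypothetical placeholders, and the ";impersonate" suffix assumes that the caller's own identity has read access to the store.

```kusto
// Sketch only: <account> and <folder> are hypothetical placeholders.
// The adl:// connection string and ';impersonate' suffix are assumptions
// about how the Data Lake Store Gen1 folder is reached and authorized.
let options = dynamic({
    'StorageContainers': [
        h@'adl://<account>.azuredatalakestore.net/<folder>;impersonate'
    ],
    'DataFormat': 'parquet',
    'FileExtension': '.parquet'
});
// Reads the matching files and returns the inferred schema as one string.
evaluate infer_storage_schema(options)
```

The result should be a single row whose one column holds the CSL schema string, a comma-separated list of name:type pairs such as "user_id:long, event_time:datetime" (the column names here are invented for illustration).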

In summary, if you need to know the schema of external data in Azure Data Lake Store Gen1, you can use the "infer_storage_schema" plug-in to automatically infer the schema based on the data in the external data source.