External Tables in DP-203 Exam | Microsoft Azure Data Engineering

External Tables in DP-203 Exam

Question

There are two types of external tables - Hadoop and Native external tables that can be used to read and export data depending upon the type and format of the external data source.

Which of the following statements are true about these external tables? (Select all that are applicable)

Answers

Explanations

Click on the arrows to vote for the correct answer

A. B. C. D. E. F. G.

Correct Answers: A, D and E

The below table highlights the major differences between Hadoop external table and the Native external table:

External table
type

Dedicated SQL
pool
Serverless SQL

pool

Supported
formats

Folder partition
elimination

Custom format
for location

Recursive folder
scan

Storage filter
pushdown

Storage
authentication

Hadoop

Available

Not available

Delimited/CSV, Parquet, ORC, Hive

RC, and RC

No

Yes

No

No

Storage Access Key(SAK), AAD
passthrough, Managed identity,
Custom application Azure AD
identity

Native

Parquet tables are available in gated preview - contact your Microsoft
Technical Account Manager or Cloud Solution Architect to check if you
can add your dedicated SQL pool to the gated preview.

Available

Serverless SQL pool: Delimited/CSV, Parquet, and Delta Lake(preview)
Dedicated SQL pool: Parquet

Only for partitioned tables synchronized from Apache Spark pools in
Synapse workspace to serverless SQL pools

Yes, using wildcards like /year=*/month=*/day=*
Only in serverless SQL pools when specified /** at the end of the
location path

Yes in serverless SQL pool. For the string pushdown, you need to use
Latin1_General_100_BIN2_UTF8 collation on the VARCHAR columns.

Shared Access Signature(SAS), AAD passthrough, Managed identity

Option A is correct.

Hadoop external tables are available only in dedicated SQL pools, but not in serverless SQL pools.

Option B is incorrect.

Hadoop external tables are only available in dedicated SQL pools, but not in serverless SQL pools.

Option C is incorrect.

Native external tables are available in serverless SQL pools, and also available in dedicated Synapse SQL pools but only in gated preview.

Option D is correct.

Native external tables are available in serverless SQL pools, but in Synapse SQL pools, they are only in gated preview.

Option E is correct.

The files having the name started with a period (.) or an underline (_) are skipped by both Hadoop as well as native external tables while reading the data.

Option F is incorrect.

Both types of files having the name started with a period (.) or an underline (_) are skipped by both types of external tables.

Option G is incorrect.

Hadoop external tables skip the files having the name started with a period (.) or an underline (_).

To know more about using external tables with Synapse SQL, please visit the below-given link:

External tables are tables that store data in an external data source such as Azure Data Lake Storage or Azure Blob Storage. They can be used to read and export data from these data sources. There are two types of external tables in Azure Synapse Analytics - Hadoop external tables and Native external tables.

A. Hadoop external tables are only available in dedicated SQL pools, but not in serverless SQL pools. This statement is true. Hadoop external tables require a dedicated SQL pool because they use a component called PolyBase, which is not available in serverless SQL pools.

B. Hadoop external tables are only available in serverless SQL pools, but not in dedicated SQL pools. This statement is false. As mentioned earlier, Hadoop external tables require a dedicated SQL pool and are not available in serverless SQL pools.

C. Native external tables are available in serverless SQL pools but are not available in dedicated Synapse SQL pools. This statement is false. Native external tables are available in both serverless SQL pools and dedicated SQL pools.

D. Native external tables are available in serverless SQL pools, but in Synapse SQL pools, they are only in gated preview. This statement is true. Native external tables are available in serverless SQL pools, but in dedicated SQL pools, they are in gated preview, which means they are in a limited availability state and require additional configuration to use.

E. The files having the name started with a period (.) or an underline () are skipped by both Hadoop as well as native external tables while reading the data. This statement is true. Files starting with a period (.) or an underline () are considered hidden files in many operating systems, and Hadoop and Native external tables skip them while reading data.

F. Only the files having the name started with underline (__), not period() are skipped by both external tables. This statement is false. Both Hadoop and Native external tables skip files starting with either a period (.) or an underline ().

G. Hadoop considers all files and skips no file irrespective of its name. This statement is false. Hadoop external tables skip files starting with a period (.) or an underline (_), just like Native external tables.

In summary, Hadoop external tables require a dedicated SQL pool, Native external tables are available in both serverless and dedicated SQL pools, and both external table types skip files starting with a period (.) or an underline (_).