Question 49 of 56 from exam PMLE: Professional Machine Learning Engineer

Question

Your team is working on an NLP research project to predict the political affiliation of authors based on articles they have written.

You have a large training dataset that is structured like this:

AuthorA:Political Party A
TextA1: [SentenceA11, SentenceA12, SentenceA13, ...]
TextA2: [SentenceA21, SentenceA22, SentenceA23, ...]

AuthorB:Political Party B
TextB1: [SentenceB11, SentenceB12, SentenceB13, ...]
TextB2: [SentenceB21, SentenceB22, SentenceB23, ...]

AuthorC:Political Party B
TextC1: [SentenceC11, SentenceC12, SentenceC13, ...]
TextC2: [SentenceC21, SentenceC22, SentenceC23, ...]

AuthorD:Political Party A
TextD1: [SentenceD11, SentenceD12, SentenceD13, ...]
TextD2: [SentenceD21, SentenceD22, SentenceD23, ...]

You followed the standard 80%-10%-10% data distribution across the training, testing, and evaluation subsets.

How should you distribute the training examples across the train-test-eval subsets while maintaining the 80-10-10 proportion?
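
To make the choice concrete, the sketch below splits a toy dataset 80-10-10 at a chosen level of the hierarchy (author, text, or sentence) and then checks whether any author ends up in more than one subset. It is only an illustration: the names `split_by_unit` and `shared_authors`, the record layout, and the ten toy authors are assumptions made for this example, not part of the question, and the sketch does not indicate which option the answer key marks.

```python
import random
from collections import defaultdict

def split_by_unit(records, unit_key, fractions=(0.8, 0.1, 0.1), seed=0):
    """Shuffle the distinct values of `unit_key` and assign whole units
    to train/test/eval, so records sharing a unit stay together."""
    units = sorted({r[unit_key] for r in records})
    random.Random(seed).shuffle(units)
    n_train = round(fractions[0] * len(units))
    n_test = round(fractions[1] * len(units))
    assignment = {}
    for i, unit in enumerate(units):
        if i < n_train:
            assignment[unit] = "train"
        elif i < n_train + n_test:
            assignment[unit] = "test"
        else:
            assignment[unit] = "eval"  # remainder goes to eval
    subsets = {"train": [], "test": [], "eval": []}
    for r in records:
        subsets[assignment[r[unit_key]]].append(r)
    return subsets

def shared_authors(subsets):
    """Return the authors that occur in more than one subset."""
    seen = defaultdict(set)
    for name, rows in subsets.items():
        for r in rows:
            seen[r["author"]].add(name)
    return {a for a, names in seen.items() if len(names) > 1}

# Toy data (hypothetical): 10 authors x 2 texts x 3 sentences = 60 records.
records = [
    {"author": f"Author{a}", "text": f"Text{a}{t}", "sentence": f"Sentence{a}{t}{s}"}
    for a in "ABCDEFGHIJ" for t in (1, 2) for s in (1, 2, 3)
]

for unit in ("author", "text", "sentence"):
    subsets = split_by_unit(records, unit)
    sizes = {name: len(rows) for name, rows in subsets.items()}
    print(unit, sizes, "authors in several subsets:", len(shared_authors(subsets)))
```

When the unit is `"author"`, every author's sentences land in exactly one subset by construction. At the text or sentence level the 80-10-10 record counts are the same, but nothing prevents the same author from appearing in several subsets.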

Answers

Explanations

A. B. C. D.

C.