Amazon SageMaker Ground Truth: Ensuring High-Quality Labeling with Automated Correction


Question

You are building a data repository for your company's social media website that allows users to upload photos and videos to their personal stream.

These photos and videos need to be labelled and classified so your company can use them to build direct marketing capabilities into your application based on machine learning.

The direct marketing capability will be used to send targeted advertisements to users who have uploaded videos or photos of content related to a given product. You are using Amazon SageMaker Ground Truth to label your users' photos and videos.

Sometimes your Ground Truth human workers mislabel images and/or videos.

Which SageMaker Ground Truth feature helps you continue to get high-quality labelling in an automated way even when your workers occasionally mislabel?

Answers

A. Chaining labeling jobs
B. Label verification and adjustment
C. Batches for labeling tasks
D. Annotation consolidation

Answer: D.

Explanations

Option A is incorrect.

Chaining labeling jobs in Ground Truth lets you reuse the dataset and labels from a previous labeling job as the input to a new one.

This feature would not help you address mislabeled images or videos.

Option B is incorrect.

The Ground Truth label verification and adjustment feature lets you send existing labels back to workers, who verify them and correct any that are wrong.

This would help you correct mislabeled items, but it is a manual process, not an automated one.

Option C is incorrect.

The Ground Truth batching feature for labeling tasks sends data objects to your workers in batches.

This would not help you correct mislabeled objects.

Option D is correct.

The Ground Truth annotation consolidation feature allows you to combine the annotations of multiple workers to produce an automated probabilistic estimate of what the correct label should be.
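The idea of combining several workers' annotations into one probabilistic estimate can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses a simple iterative weighted vote, not the algorithm Ground Truth actually runs, and the function name `consolidate`, the worker IDs, and the sample data are all hypothetical.

```python
from collections import defaultdict


def consolidate(annotations, rounds=5):
    """Combine (worker_id, item_id, label) tuples into one label per item.

    Illustrative weighted vote only; NOT Ground Truth's real algorithm.
    Returns {item_id: (label, confidence)}.
    """
    # Start by trusting every worker equally.
    weight = defaultdict(lambda: 1.0)
    result = {}
    for _ in range(rounds):
        # Step 1: weighted vote per item using the current worker weights.
        scores = defaultdict(lambda: defaultdict(float))
        for worker, item, label in annotations:
            scores[item][label] += weight[worker]
        result = {}
        for item, label_scores in scores.items():
            best = max(label_scores, key=label_scores.get)
            total = sum(label_scores.values())
            result[item] = (best, label_scores[best] / total)
        # Step 2: re-estimate each worker's weight as a smoothed rate of
        # agreement with the current consolidated labels.
        agree = defaultdict(int)
        seen = defaultdict(int)
        for worker, item, label in annotations:
            seen[worker] += 1
            agree[worker] += int(label == result[item][0])
        for worker in seen:
            weight[worker] = (agree[worker] + 1) / (seen[worker] + 2)
    return result


# Hypothetical sample: worker w3 mislabels img1 as "cat".
sample = [
    ("w1", "img1", "dog"), ("w2", "img1", "dog"), ("w3", "img1", "cat"),
    ("w1", "img2", "cat"), ("w2", "img2", "cat"), ("w3", "img2", "cat"),
    ("w1", "img3", "car"), ("w2", "img3", "car"), ("w3", "img3", "car"),
]
consolidated = consolidate(sample)
```

In this sample the single wrong annotation on img1 is outvoted, and the worker who disagreed with the consensus ends up with a lower weight in later rounds. The confidence value for img1 is below 1 because the workers did not all agree.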

Reference:

Please see the Amazon SageMaker developer guide titled "Data Labeling", and the Amazon Machine Learning blog titled "Use the wisdom of crowds with Amazon SageMaker Ground Truth to annotate data more accurately".

The correct answer to this question is D, annotation consolidation.

Amazon SageMaker Ground Truth is a fully managed data labeling service that makes it easy to build accurate training datasets for machine learning. It allows users to create and manage custom labeling workflows, and it also provides built-in workflows for common labeling tasks. Ground Truth can be used for a variety of labeling tasks, including image classification, object detection, semantic segmentation, and text classification.

In this scenario, you are building a data repository for a social media website where users upload photos and videos to their personal stream. The photos and videos need to be labeled and classified so that the company can build machine-learning-based direct marketing capabilities into the application.

You are using Ground Truth to label the uploaded photos and videos. However, the human workers sometimes mislabel images and videos, which lowers the quality of the labels and therefore the accuracy of the machine learning models trained on them.

To keep label quality high despite occasional worker mistakes, use the annotation consolidation feature. Each data object is sent to more than one worker, and Ground Truth combines their annotations into a single label for that object.

Because the consolidated label is a probabilistic estimate based on all of the workers' responses, an occasional wrong annotation from one worker is typically outweighed by the others. Consolidation runs automatically as part of the labeling job, without a separate manual review step.

The label verification and adjustment feature also addresses mislabeled data, but it works by sending existing labels back to human workers, who verify them or adjust them by hand. It is therefore a manual process and does not meet the requirement for an automated solution.

In conclusion, annotation consolidation helps ensure high-quality labeling even when workers occasionally mislabel images and videos. By automatically combining the annotations of multiple workers, it improves the accuracy of the training labels, which in turn can improve the models behind the direct marketing capabilities in the application.
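Annotation consolidation only has something to combine when each data object is labeled by more than one worker. As a rough sketch, the fragment below builds the part of a `HumanTaskConfig` for the SageMaker `CreateLabelingJob` API that controls this. The field names `WorkteamArn`, `NumberOfHumanWorkersPerDataObject`, `AnnotationConsolidationConfig`, and `AnnotationConsolidationLambdaArn` come from that API; the helper `build_human_task_config`, the placeholder ARNs, and the minimum-of-two check are assumptions made for illustration. A real request needs further fields (UI template, task title, time limits, and so on) that are omitted here.

```python
def build_human_task_config(workteam_arn, consolidation_lambda_arn,
                            workers_per_object=3):
    """Return a PARTIAL HumanTaskConfig dict (hypothetical helper).

    Only the fields relevant to annotation consolidation are included.
    """
    # Assumption of this sketch: with a single worker per object there is
    # nothing to consolidate, so require at least two.
    if workers_per_object < 2:
        raise ValueError("use at least 2 workers per object to consolidate")
    return {
        "WorkteamArn": workteam_arn,
        # How many different workers label each photo or video.
        "NumberOfHumanWorkersPerDataObject": workers_per_object,
        # Lambda function that merges the workers' annotations.
        "AnnotationConsolidationConfig": {
            "AnnotationConsolidationLambdaArn": consolidation_lambda_arn,
        },
    }


# Placeholder ARNs; substitute the values for your own account and region.
config = build_human_task_config(
    workteam_arn="arn:aws:sagemaker:REGION:ACCOUNT:workteam/private-crowd/EXAMPLE",
    consolidation_lambda_arn="arn:aws:lambda:REGION:ACCOUNT:function:EXAMPLE-consolidation",
)
```

The resulting dictionary would be merged with the remaining required fields and passed as the `HumanTaskConfig` argument when creating the labeling job. Raising the number of workers per object generally gives the consolidation step more evidence to work with, at a higher labeling cost.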