Monitoring Azure Stream Analytics Job: Possible Causes of High Watermark Delay

Reasons for High Watermark Delay in Azure Stream Analytics Job

Question

You are monitoring an Azure Stream Analytics job by using metrics in Azure.

You discover that during the last 12 hours, the average watermark delay is consistently greater than the configured late arrival tolerance.

What is a possible cause of this behavior?

Answers

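A. The job lacks the resources to process the volume of incoming data.
B. The late arrival policy causes events to be dropped.
C. Events whose application timestamp is earlier than their arrival time by more than five minutes arrive as inputs.
D. There are errors in the input data.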

Correct Answer: A

https://azure.microsoft.com/en-us/blog/new-metric-in-azure-stream-analytics-tracks-latency-of-your-streaming-pipeline/

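In Azure Stream Analytics, the watermark is an internal marker of how far event time has progressed in the job. Heuristically, it is the largest event time the service has seen so far minus the out-of-order tolerance window. The watermark delay is the wall-clock time of the processing node minus the largest watermark it has seen. Because the late arrival policy limits how far behind the arrival time an accepted event's timestamp can be, late events alone should not normally push the watermark delay past the late arrival tolerance. A watermark delay that is consistently greater than the configured late arrival tolerance therefore indicates that the job itself is falling behind the incoming stream.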
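The two definitions above can be made concrete with a small calculation. This is only an illustrative sketch: the function names `watermark` and `watermark_delay` and the sample timestamps are made up for the example, and the service computes these values internally per partition rather than through any such API.

```python
from datetime import datetime, timedelta, timezone


def watermark(max_event_time: datetime, out_of_order_tolerance: timedelta) -> datetime:
    """Heuristic watermark: largest event time seen minus the out-of-order window."""
    return max_event_time - out_of_order_tolerance


def watermark_delay(wall_clock: datetime, current_watermark: datetime) -> timedelta:
    """Watermark delay: wall-clock time of the processing node minus the watermark."""
    return wall_clock - current_watermark


# Hypothetical values: the newest event processed is 90 seconds old,
# and the out-of-order tolerance window is 10 seconds.
now = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
max_seen = now - timedelta(seconds=90)

wm = watermark(max_seen, timedelta(seconds=10))
delay = watermark_delay(now, wm)
print(delay.total_seconds())  # 100.0
```

If the job keeps up with its input, `max_seen` stays close to `now` and the delay stays small. If the job falls behind, `max_seen` lags further and further, and the delay grows.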

Based on the options provided, option C - Events whose application timestamp is earlier than their arrival time by more than five minutes arrive as inputs - can look plausible, but it is not the cause. Such events are late arrivals, and the late arrival policy handles them: an event that is later than the configured tolerance either has its timestamp adjusted or is dropped, depending on the policy. In both cases the event does not hold the watermark back beyond the tolerance, so late events do not explain a watermark delay that stays above it.

Option A - The job lacks the resources to process the volume of incoming data - is the correct answer. If the job is not allocated sufficient resources, it cannot process the incoming data as fast as it arrives. A backlog builds up, the watermark falls further behind the wall clock, and the watermark delay stays high. The same symptom can appear when the input source or the output sink cannot deliver enough throughput.
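The reasoning about insufficient resources can be turned into a rough check on exported metric samples. The sketch below is an assumption-laden illustration, not an Azure API: `diagnose`, its return labels, and the 80% utilization threshold are all chosen for the example, and the inputs are assumed to be lists of per-interval averages for watermark delay (in seconds) and streaming unit utilization (in percent).

```python
from statistics import mean


def diagnose(watermark_delay_s, su_utilization_pct, late_arrival_tolerance_s,
             su_threshold_pct=80.0):
    """Classify a monitoring window from per-interval metric averages.

    Returns a hypothetical label:
      "scale_up"           - delay always above tolerance and utilization is high
      "check_input_output" - delay always above tolerance but utilization is low
      "ok"                 - delay is not consistently above tolerance
    """
    consistently_lagging = all(d > late_arrival_tolerance_s for d in watermark_delay_s)
    if not consistently_lagging:
        return "ok"
    if mean(su_utilization_pct) >= su_threshold_pct:
        return "scale_up"
    return "check_input_output"


# Hypothetical samples with a 300-second late arrival tolerance.
result = diagnose([400, 420, 390], [92, 95, 90], late_arrival_tolerance_s=300)
print(result)  # scale_up
```

A result of `"check_input_output"` corresponds to the case where the job has spare capacity but the input source or output sink is the bottleneck.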

Option B - The late arrival policy causes events to be dropped - is not the cause. Dropping events that are later than the tolerance reduces the amount of data the job has to process; it does not slow the job down or hold the watermark back, so it cannot produce a consistently high watermark delay.
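How the late arrival policy treats an individual event can be sketched as follows. This is a simplified model written for illustration: the function `apply_late_arrival_policy` and the policy strings `"adjust"` and `"drop"` are hypothetical names, and the model assumes that an adjusted event is given the timestamp of its arrival time minus the tolerance.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def apply_late_arrival_policy(event_time: datetime, arrival_time: datetime,
                              tolerance: timedelta,
                              policy: str = "adjust") -> Optional[datetime]:
    """Return the effective event time, or None if the event is dropped."""
    earliest_allowed = arrival_time - tolerance
    if event_time >= earliest_allowed:
        return event_time          # within tolerance: timestamp kept as-is
    if policy == "adjust":
        return earliest_allowed    # too late: timestamp moved up to the limit
    if policy == "drop":
        return None                # too late: event discarded
    raise ValueError(f"unknown policy: {policy!r}")


arrival = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
tolerance = timedelta(minutes=5)
very_late = arrival - timedelta(minutes=20)

adjusted = apply_late_arrival_policy(very_late, arrival, tolerance, "adjust")
dropped = apply_late_arrival_policy(very_late, arrival, tolerance, "drop")
```

Under either policy, no accepted event ends up with an effective timestamp more than `tolerance` behind its arrival time, which is why late events by themselves do not account for a watermark delay above the tolerance.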

Option D - There are errors in the input data - is not the expected cause either. Malformed input typically surfaces as deserialization errors or dropped events, which are reported through the job's error metrics and logs, rather than as a consistently high watermark delay.