Azure IoT Data Engineering: Calculate Hourly Temperature Differences in Solar Power Panels

Question

Your company installs solar power panels which are integrated with IoT devices.

These devices send temperature data to Azure IoT Hub.

You have written the following query to find, for each sensor, the change in temperature since its previous reading within the last hour.

<pre>SELECT PanelId, growth = reading - XXXX(reading) OVER (PARTITION BY sensorId XXXX(hour, 1)) FROM input</pre>

Which of the following options fills the two XXXX placeholders, in order?

Answers

A. LAG, WHEN

B. LAG, LIMIT DURATION

C. LAST, LIMIT DURATION

D. LAST, WHEN

Explanations

Correct Answer: B.

LAG is an analytic operator that looks up a previous event in an event stream.

It is most often used to compute the growth or change of a variable, which here is the temperature of the panel.

As the name suggests, LIMIT DURATION restricts how far back in time the lookup may go.

Together, LAG and LIMIT DURATION determine which earlier event is compared with the current one: the previous event from the same sensor, provided it arrived within the last hour.
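The lookup described above can be imitated outside Stream Analytics. The following is a minimal Python sketch, not the service's implementation: the `lag` helper, the event dictionaries, and the sample readings are hypothetical, events are assumed to arrive in timestamp order, and treating the one-hour boundary as inclusive is an assumption.

```python
from datetime import datetime, timedelta

def lag(events, offset=1, default=None, limit=timedelta(hours=1)):
    """Approximate LAG(reading, offset, default) OVER
    (PARTITION BY sensorId LIMIT DURATION(...)) for a time-ordered list."""
    history = {}   # sensorId -> earlier (time, reading) pairs, oldest first
    results = []
    for ev in events:
        past = history.setdefault(ev["sensorId"], [])
        # Earlier readings from the same sensor inside the lookback window.
        # The inclusive boundary is an assumption of this sketch.
        in_window = [r for (t, r) in past if ev["time"] - t <= limit]
        results.append(in_window[-offset] if len(in_window) >= offset else default)
        past.append((ev["time"], ev["reading"]))
    return results

t0 = datetime(2024, 1, 1, 9, 0)
events = [
    {"sensorId": "s1", "time": t0, "reading": 20.0},
    {"sensorId": "s2", "time": t0 + timedelta(minutes=10), "reading": 30.0},
    {"sensorId": "s1", "time": t0 + timedelta(minutes=30), "reading": 23.5},
    {"sensorId": "s2", "time": t0 + timedelta(minutes=40), "reading": 29.0},
    {"sensorId": "s1", "time": t0 + timedelta(minutes=150), "reading": 25.0},
]

previous = lag(events)
# growth = reading - previous reading, left empty when nothing was found
growth = [None if p is None else ev["reading"] - p
          for ev, p in zip(events, previous)]
print(previous)  # [None, None, 20.0, 30.0, None]
print(growth)    # [None, None, 3.5, -1.0, None]
```

The last s1 event finds no previous reading because the earlier s1 readings are more than an hour old, which is the effect of the duration limit.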

Option A is incorrect: Using LAG is correct here.

But WHEN adds a condition on which events are considered; it does not specify a timeframe or interval, so it cannot take the place of the second placeholder.

Option B is correct: The use of LAG and LIMIT DURATION works perfectly in this case.

Option C is incorrect: Using LIMIT DURATION is correct here, but LAST is used to look up the most recent event.

Because the first keyword is wrong, this option is not the best choice.

Option D is incorrect: LAST and WHEN cannot be used here.

The given query is written in the Azure Stream Analytics query language, which is based on a subset of T-SQL syntax. The query is incomplete: the two XXXX placeholders must be filled in before it can run.

To find the change in temperature for each sensor, we need to compare the current reading with the previous reading from the same sensor within the last hour. For this purpose, we can use the LAG operator, which looks up a previous event in the event stream within the constraints given in its OVER clause.

The correct option to fill the unknown parts is B: LAG, LIMIT DURATION.

The completed query using LAG would be:

<pre>SELECT PanelId, growth = reading - LAG(reading) OVER (PARTITION BY sensorId LIMIT DURATION(hour, 1)) FROM input</pre>

Here, the LAG operator gets the previous reading for each sensor. The WHEN clause is not required in this case, because no extra condition on the earlier events is needed.

LAG also accepts two optional arguments. The second argument specifies the offset, which defaults to 1 and therefore already selects the previous reading. The third argument specifies the value to return when no previous event is available; if it is omitted, LAG returns NULL.

The PARTITION BY clause partitions the data by sensorId, which means that LAG is applied separately for each sensor. There is no ORDER BY clause, because events in the stream are already processed in timestamp order. LIMIT DURATION(hour, 1) limits the lookup to the past hour, so a reading is compared with the previous reading from the same sensor only if that reading arrived within the last hour.

Finally, we subtract the previous reading from the current reading to get the change in temperature for each sensor, which is returned in the growth column.

WHEN is an optional clause that can follow LIMIT DURATION, but it cannot replace it. For example, to compare each reading with the most recent non-null reading from the last hour:

<pre>SELECT PanelId, growth = reading - LAG(reading) OVER (PARTITION BY sensorId LIMIT DURATION(hour, 1) WHEN reading IS NOT NULL) FROM input</pre>

Here, the WHEN clause restricts which earlier events LAG considers: events whose reading is NULL are skipped, so LAG returns the most recent non-null reading from the same sensor within the last hour. Because WHEN takes a boolean condition rather than a time interval, it cannot fill the second placeholder in the question, which is why option A is wrong and option B is correct.
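To illustrate how a WHEN condition changes what LAG returns in Azure Stream Analytics, here is a rough Python sketch. It is a simulation under assumptions, not the service's implementation: `lag_when`, the event dictionaries, and the sample readings are hypothetical, events are assumed to arrive in timestamp order, and the one-hour boundary is treated as inclusive.

```python
from datetime import datetime, timedelta

def lag_when(events, when, limit=timedelta(hours=1)):
    """Approximate LAG(reading) OVER (PARTITION BY sensorId
    LIMIT DURATION(hour, 1) WHEN <condition>) for a time-ordered list."""
    history = {}   # sensorId -> earlier events, oldest first
    results = []
    for ev in events:
        past = history.setdefault(ev["sensorId"], [])
        # Keep earlier events that are inside the window AND satisfy WHEN.
        candidates = [p["reading"] for p in past
                      if ev["time"] - p["time"] <= limit and when(p)]
        results.append(candidates[-1] if candidates else None)
        past.append(ev)
    return results

t0 = datetime(2024, 1, 1, 9, 0)
events = [
    {"sensorId": "s1", "time": t0, "reading": 21.0},
    {"sensorId": "s1", "time": t0 + timedelta(minutes=15), "reading": None},
    {"sensorId": "s1", "time": t0 + timedelta(minutes=30), "reading": 22.0},
]

# No condition: the third event sees the NULL reading that precedes it.
plain = lag_when(events, when=lambda e: True)
# WHEN reading IS NOT NULL: the NULL reading is skipped.
last_non_null = lag_when(events, when=lambda e: e["reading"] is not None)
print(plain)          # [None, 21.0, None]
print(last_non_null)  # [None, 21.0, 21.0]
```

In both runs the time window is unchanged; only the set of eligible earlier events differs, which is why WHEN complements LIMIT DURATION rather than replacing it.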