Sharding Strategy for Efficient Querying of Orders Placed in a Specific Month

Question

You have an application that runs frequent queries to identify all the orders placed in a specific month.

This process can be run more efficiently if all orders for a month are stored in date and time order in the same shard.

Which sharding strategy should be used here?

Answers

Explanations

A. B. C. D.

Correct Answer: C.

The question says that storing all orders for a month in the same shard, in date and time order, makes the query more efficient.

This is the Range strategy.

Related items are placed together in the same shard and ordered by the shard key.

The shard keys are also sequential.

[Figure: Range sharding. The sharding logic in each application instance maps orders for October to shard A, orders for November to shard B, orders for December to shard C, and so on up to shard N. A query to find orders placed in December is served by a single shard, where the orders are stored in date/time sequence.]

Image source: Microsoft Documentation.

Option A is incorrect: This strategy works by routing. When data needs to be located, the sharding logic uses the shard key to find the shard that holds the data and routes the request to it.

Option B is incorrect: This strategy is used to give shards roughly equal size and load, which spreads a month's orders across shards instead of keeping them together.

Option C is correct: This is the Range strategy, which is the correct strategy for this scenario.

The most appropriate sharding strategy for storing orders in date and time order in the same shard is Range Sharding. Range Sharding is a sharding strategy where data is partitioned based on a range of values, such as a range of dates or timestamps in this case.

With range sharding, data is stored in contiguous ranges of a specified key. In this scenario, the key is the date or timestamp of the order, and each shard stores a specific range of dates or timestamps. If each range covers a whole calendar month, all orders for a specific month are stored in the same shard and in date and time order.
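The range mapping described above can be sketched in a few lines of Python. This is a minimal in-memory illustration, not a production design: the shard names, the 2023 month boundaries, and the `RangeShardedOrders` class are all hypothetical, and it assumes each shard holds exactly one calendar month.

```python
from bisect import bisect_left, bisect_right, insort
from datetime import datetime

# Hypothetical range boundaries: RANGE_STARTS[i] is the first timestamp
# that belongs to SHARD_NAMES[i]. Each shard covers one calendar month.
RANGE_STARTS = [
    datetime(2023, 10, 1),
    datetime(2023, 11, 1),
    datetime(2023, 12, 1),
]
SHARD_NAMES = ["shard_A", "shard_B", "shard_C"]
RANGE_END = datetime(2024, 1, 1)  # exclusive upper bound of the last range


def shard_for(placed_at: datetime) -> str:
    """Return the name of the shard whose date range contains placed_at."""
    if not (RANGE_STARTS[0] <= placed_at < RANGE_END):
        raise ValueError(f"no shard covers {placed_at}")
    return SHARD_NAMES[bisect_right(RANGE_STARTS, placed_at) - 1]


class RangeShardedOrders:
    """In-memory stand-in for a set of range-partitioned shards."""

    def __init__(self) -> None:
        self.shards = {name: [] for name in SHARD_NAMES}

    def insert(self, order_id: str, placed_at: datetime) -> None:
        # insort keeps each shard sorted by (timestamp, order_id).
        insort(self.shards[shard_for(placed_at)], (placed_at, order_id))

    def orders_in_month(self, year: int, month: int) -> list:
        """Return the month's order IDs in timestamp order, reading one shard."""
        start = datetime(year, month, 1)
        end = datetime(year + month // 12, month % 12 + 1, 1)
        rows = self.shards[shard_for(start)]
        # A 1-tuple sorts before any (timestamp, order_id) pair that shares
        # its timestamp, so these bounds select start <= timestamp < end.
        lo = bisect_left(rows, (start,))
        hi = bisect_left(rows, (end,))
        return [order_id for _, order_id in rows[lo:hi]]
```

With this layout, a call such as `orders_in_month(2023, 12)` touches only `shard_C` and returns that month's order IDs already in timestamp order, regardless of the order in which they were inserted.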

Lookup sharding is a sharding strategy where data is partitioned based on a lookup table. This is typically used for smaller datasets where a single lookup table can be used to map data to its corresponding shard. Hash sharding is a sharding strategy where data is partitioned based on a hash function. This is useful for evenly distributing data across multiple shards.
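For contrast, the other two strategies can be sketched as follows. The shard names, the tenant-based lookup map, and the choice of SHA-256 are assumptions made for illustration; the point is only that neither mapping looks at the order date, so a month's orders are not kept together.

```python
import hashlib

SHARDS = ["shard_0", "shard_1", "shard_2"]  # hypothetical shard names

# Lookup strategy: an explicit map from a shard key to a shard.
# Keying the map by tenant ID is an assumption for this sketch.
LOOKUP_MAP = {"tenant_1": "shard_0", "tenant_2": "shard_2"}


def lookup_shard(tenant_id: str) -> str:
    """Route a request by consulting the lookup map."""
    return LOOKUP_MAP[tenant_id]


def hash_shard(order_id: str) -> str:
    """Hash strategy: pick a shard from a stable hash of the key."""
    digest = hashlib.sha256(order_id.encode("utf-8")).digest()
    return SHARDS[int.from_bytes(digest[:8], "big") % len(SHARDS)]


# Orders from the same month spread over several shards under hashing,
# so a "find orders placed in December" query has to visit all of them.
december_orders = [f"order-{i}" for i in range(100)]
shards_touched = {hash_shard(order_id) for order_id in december_orders}
```

Here `shards_touched` will almost certainly contain more than one shard, which is why hashing balances load well but does not suit the month-based query in the question.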

In summary, for storing orders in date and time order in the same shard, Range Sharding is the most appropriate sharding strategy.