Elasticsearch Architecture VII: How Does Elasticsearch Read Data?

Mahmoud Yasser
Published in Dev Genius
3 min read · Jun 2, 2023

Hello there! In our previous conversation, we briefly discussed the concept of routing in Elasticsearch. Now, let’s go on a journey to learn more about how Elasticsearch handles data reading. By the end of this article, you will have an in-depth understanding of routing, Adaptive Replica Selection (ARS), and the overall data reading workflow in Elasticsearch.

Routing is essential in this process, but there are other factors to consider. Please keep in mind that our talk will mostly centre on reading a single document, and we’ll go into search queries in further depth when the time comes.

To begin, a read request is received by one node in the cluster. That node, known as the coordinating node, is in charge of coordinating the request.

Now, you might be wondering what this coordination entails. Well, the first step involves determining the location of the document we want to retrieve. As we previously discussed, routing is utilized for this purpose.
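As a rough illustration of that lookup, the sketch below follows the simplified form of the routing rule: hash the routing value (the document ID by default) and take it modulo the number of primary shards. Elasticsearch uses a Murmur3 hash internally; `zlib.crc32` is only a stand-in here, so the shard numbers will not match a real cluster. The function name `route_to_shard` is hypothetical.

```python
import zlib
from typing import Optional


def route_to_shard(doc_id: str, num_primary_shards: int,
                   routing: Optional[str] = None) -> int:
    """Return the shard number a document maps to (simplified sketch)."""
    if num_primary_shards <= 0:
        raise ValueError("num_primary_shards must be positive")
    # The routing value defaults to the document ID unless a custom one is given.
    routing_value = routing if routing is not None else doc_id
    # Stand-in hash (assumption): Elasticsearch itself uses Murmur3, not CRC32.
    hashed = zlib.crc32(routing_value.encode("utf-8"))
    return hashed % num_primary_shards


if __name__ == "__main__":
    for doc_id in ("user-1", "user-2", "user-3"):
        print(doc_id, "-> shard", route_to_shard(doc_id, num_primary_shards=5))
```

Because the result depends only on the routing value and the shard count, the coordinating node can work out where a document lives without asking any other node.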

You may recall from our earlier discussion that routing determines the shard where a given document is stored. That is still true, but let’s be a little more precise. Routing resolves to a single primary shard if the shard has no replicas, or to a replication group (the primary shard together with its replicas) if it has been replicated. As previously noted, shards are almost always replicated, for a variety of reasons.

To scale, Elasticsearch does not simply fetch the document from the primary shard every time. If it did, all retrievals for that shard would end up on the same shard copy, and that strategy does not hold up as the workload grows. Instead, any copy in the replication group can serve the read, and Elasticsearch uses a mechanism known as Adaptive Replica Selection (ARS) to choose among them.
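To make the scaling argument concrete, here is a toy comparison, not Elasticsearch code. It counts how many reads each copy in a replication group would serve if every read went to the primary, versus if reads were rotated across all copies. The copy names and the plain round-robin rotation are assumptions for illustration only.

```python
from collections import Counter
from itertools import cycle


def primary_only(copies, num_reads):
    """Send every read to the first copy, which we treat as the primary."""
    return Counter({copies[0]: num_reads})


def spread_across_copies(copies, num_reads):
    """Rotate reads across all copies (plain round-robin, an assumption)."""
    load = Counter()
    rotation = cycle(copies)
    for _ in range(num_reads):
        load[next(rotation)] += 1
    return load


if __name__ == "__main__":
    group = ["primary", "replica-1", "replica-2"]  # hypothetical copy names
    print(dict(primary_only(group, 900)))
    print(dict(spread_across_copies(group, 900)))
```

With three copies and 900 reads, the primary-only strategy puts all 900 reads on one copy, while the rotation leaves each copy with 300.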

ARS is critical to the retrieval process. It chooses the best shard copy depending on a variety of criteria. While I won’t go into detail about the evaluation formula right now, rest assured that Elasticsearch takes a comprehensive approach to determining the best shard copy.
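Without reproducing the actual evaluation formula, the general idea of ranking copies can be sketched with a deliberately simplified score. The inputs below (recent response time, recent service time, and queue size) are the kinds of signals the Elasticsearch documentation describes for ARS, but the weighting in `toy_rank`, the `CopyStats` type, and the node names are all made up for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CopyStats:
    node: str                # node holding this shard copy (hypothetical name)
    response_time_ms: float  # recent response time seen by the coordinating node
    service_time_ms: float   # recent time the node spent executing requests
    queue_size: int          # requests waiting in the node's thread pool queue


def toy_rank(stats: CopyStats) -> float:
    """Lower is better. Made-up weighting, NOT Elasticsearch's formula."""
    # Queued requests are served first, so scale service time by queue depth.
    return stats.response_time_ms + stats.service_time_ms * (stats.queue_size + 1)


def pick_copy(copies: List[CopyStats]) -> CopyStats:
    """Choose the copy with the lowest toy rank."""
    if not copies:
        raise ValueError("no shard copies available")
    return min(copies, key=toy_rank)


if __name__ == "__main__":
    copies = [
        CopyStats("node-a", 12.0, 4.0, 10),  # fast link, but a long queue
        CopyStats("node-b", 20.0, 5.0, 0),   # slower link, idle node
        CopyStats("node-c", 15.0, 6.0, 3),
    ]
    print(pick_copy(copies).node)
```

In this example the idle node wins even though its response time is the highest, which is the point of an adaptive choice: the copy that looks closest is not always the one that will answer soonest.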
