elasticsearch

Creating Custom Elasticsearch Analyzers

May 6, 2018 by

In a previous post, you saw how to configure one of the built-in analyzers as well as a token filter. Now it’s time to see how we can build our own custom analyzer. We do that by defining which character filters, tokenizer, and token filters the analyzer should consist of, and potentially configuring them. PUT… read more
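As a rough illustration of the three building blocks named above, here is a minimal sketch of index settings for a custom analyzer, built as a plain Python dictionary. The analyzer name `my_analyzer` and the helper function are hypothetical, and the chosen components (`html_strip`, `standard`, `lowercase`, `stop`) are just one plausible combination, not necessarily the one used in the post.

```python
import json


def build_custom_analyzer_settings(name="my_analyzer"):
    """Return index settings defining one custom analyzer.

    The analyzer name and the choice of components are assumptions
    made for illustration only.
    """
    return {
        "settings": {
            "analysis": {
                "analyzer": {
                    name: {
                        "type": "custom",
                        # Character filters run first, on the raw text.
                        "char_filter": ["html_strip"],
                        # Exactly one tokenizer splits the text into tokens.
                        "tokenizer": "standard",
                        # Token filters run in order on the token stream.
                        "filter": ["lowercase", "stop"],
                    }
                }
            }
        }
    }


if __name__ == "__main__":
    # This JSON would be the request body when creating an index.
    print(json.dumps(build_custom_analyzer_settings(), indent=2))
```

An analyzer has zero or more character filters, exactly one tokenizer, and zero or more token filters, which is why only `tokenizer` takes a single value here.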

Configuring Elasticsearch Analyzers & Token Filters

May 6, 2018 by

Elasticsearch ships with a number of built-in analyzers and token filters, some of which can be configured through parameters. In the following example, I will configure the standard analyzer to remove stop words, which causes it to enable the stop token filter. I will create a new index for this purpose and define an analyzer… read more

Understanding the Inverted Index in Elasticsearch

May 5, 2018 by

If you read how analyzers work in Elasticsearch before reading this post, then you know how Elasticsearch analyzes text fields. You might then wonder what actually happens with the results of the analysis process. They must end up being stored somewhere, right? Otherwise, what’s the point? The results from the analysis are indeed… read more

Understanding Analysis in Elasticsearch (Analyzers)

May 5, 2018 by

In Elasticsearch, the values of text fields are analyzed when documents are added or updated. So what does it mean that text is analyzed? When a document is indexed, its full-text fields are run through an analysis process. By full-text fields, I am referring to fields of the type text, and not keyword fields, which are… read more

Understanding Replication in Elasticsearch

August 8, 2017 by

In order to understand how replication works in Elasticsearch, you should already understand how sharding works, so be sure to check that out first. Hardware can fail at any time, and software can be buggy at times. Let’s face it, sometimes things just stop working. The more hardware capacity you add, the higher the risk… read more

Understanding Sharding in Elasticsearch

August 8, 2017 by

Elasticsearch is extremely scalable due to its distributed architecture. One of the reasons for this is something called sharding. If you have worked with other technologies such as relational databases before, then you may have heard of this term. Before getting into what sharding is, let’s first talk about why it… read more

Introduction to the Elasticsearch Architecture

August 8, 2017 by

This article is an introduction to the physical architecture of Elasticsearch, that is, how documents are distributed across virtual or physical machines and how machines work together to form what is known as a cluster. Nodes & Clusters To start things off, we will begin by talking about nodes and clusters, which are at the centre… read more

Aggregations

November 12, 2016 by
Part 35 of 35 in the Complete Guide to Elasticsearch series

Aggregations are a way of grouping and extracting statistics from your data. If you are familiar with relational databases, you can think of this as the equivalent of SQL’s GROUP BY clause and aggregate functions such as SUM. Interestingly, Elasticsearch provides a rather powerful feature that allows you to execute searches and return hits… read more

Sorting Results

November 12, 2016 by
Part 34 of 35 in the Complete Guide to Elasticsearch series

In this article, we will take a look at how to sort search results. When retrieving documents from Elasticsearch, it is possible to sort the search results. If you are familiar with relational databases, then this is the equivalent of the ORDER BY query clause. I will perform a search for the term pasta… read more

Pagination

November 12, 2016 by
Part 33 of 35 in the Complete Guide to Elasticsearch series

In this article, you will learn how to do pagination in Elasticsearch. In the previous article, I introduced the size parameter, which I will also be using to paginate through search results. While the size parameter specifies how many documents should be returned in the results, the from parameter specifies which document index to start… read more
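To make the relationship between the two parameters concrete, here is a minimal sketch that turns a 1-based page number into a search request body. The helper name `page_to_search_body` and the `match_all` default query are assumptions for illustration; only the `from` and `size` parameters come from the article.

```python
def page_to_search_body(page, size, query=None):
    """Build a search body for a 1-based page number.

    `from` is a zero-based offset, so page 1 starts at offset 0.
    The helper itself is hypothetical, not part of any client library.
    """
    if page < 1 or size < 1:
        raise ValueError("page and size must both be at least 1")
    return {
        "from": (page - 1) * size,  # number of documents to skip
        "size": size,               # number of documents to return
        "query": query if query is not None else {"match_all": {}},
    }


if __name__ == "__main__":
    # Third page of ten results per page skips the first 20 documents.
    print(page_to_search_body(page=3, size=10))
```

Keeping the offset calculation in one place avoids off-by-one mistakes when the user interface counts pages from 1 while `from` counts documents from 0.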