Searching with Query DSL: Term Level Queries

Published on November 12, 2016 by Bo Andersen

Now that we’ve got full text queries covered, let’s take a look at term level queries. As I mentioned in the introduction to searching in Elasticsearch, term level queries are used for exact matching of values. This means that, unlike full text queries, the search query is not analyzed before it is matched against the index.

This also calls for a word of caution: string fields are analyzed by default when documents are indexed, but the search term of a term query is not analyzed. This causes some people headaches because they are not aware of it. If you add a document with a field’s value set to “pasta!”, you might think that you can find that document by searching for that same value. This is not the case, however, because the value was analyzed at index time and the exclamation point was removed. The term stored in the index is “pasta”, which is no longer an exact match for the search query “pasta!”. Note that this example holds for the default analyzer, but the result can vary if a different analyzer has been chosen for the field.

This is a potential problem for string fields, but term queries are not used that frequently for fields of this type. To avoid the problem, it’s possible to tell Elasticsearch not to analyze a field’s values, which means that they are added to the index as-is. To do this, set the index property of the field to not_analyzed in the mapping. (From Elasticsearch 5.0 onwards, the equivalent is to map the field with the keyword type.)
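
To see why the “pasta!” example fails, it can help to ask Elasticsearch what the analyzer actually produces. The sketch below assumes the standard analyzer and the request body syntax of the _analyze API from Elasticsearch 2.x. The second request creates a hypothetical index named ecommerce_exact (it is not part of this series) in which the name field is left unanalyzed.

```console
# Show the terms that the standard analyzer produces for "pasta!".
# The response should contain a single token, "pasta", without the exclamation point.
GET /_analyze
{
  "analyzer": "standard",
  "text": "pasta!"
}

# Hypothetical index whose name field is indexed as-is (Elasticsearch 2.x mapping syntax).
PUT /ecommerce_exact
{
  "mappings": {
    "product": {
      "properties": {
        "name": {
          "type": "string",
          "index": "not_analyzed"
        }
      }
    }
  }
}
```

With a mapping like this, a term query for “pasta!” on the name field should match a document whose name is exactly “pasta!”, because the value is stored unchanged.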

With that out of the way, I will now show you the first type of term level query, namely the term query. I hope this terminology doesn’t confuse you too much, but that’s what it’s called. The term query searches a field for an exact value and only matches if the field contains exactly the term that was searched for.

GET /ecommerce/product/_search
{
  "query": {
    "term": {
      "status": "active"
    }
  }
}

This query searches for the exact term active in the status field.

Very similarly, the terms query matches documents in which a field contains any of the provided terms. As you might have noticed, the name of the query is plural, meaning that we can search for multiple terms by passing them as an array for a field.

GET /ecommerce/product/_search
{
  "query": {
    "terms": {
      "status": [ "inactive", "paused" ]
    }
  }
}

In this example, we searched for products that have a status of either inactive or paused.

Another term level query is the range query, which matches values within a certain range. It can be used with several data types such as dates, but I will show you an example that works with numbers. Specifically, I want to find products with between 1 and 10 copies in the inventory.

GET /ecommerce/product/_search
{
  "query": {
    "range": {
      "quantity": {
        "gte": 1,
        "lte": 10
      }
    }
  }
}

The gte and lte parameters mean “greater than or equal to” and “less than or equal to”. Besides these, the lt and gt parameters also exist, which translate into “less than” and “greater than”, respectively.
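
The same query shape should also work for dates. As a rough sketch, assuming the products have a date field named created (a hypothetical field, not one defined in this series), the following request would look for products created during 2016. It uses gte for the inclusive lower bound and lt for the exclusive upper bound.

```console
GET /ecommerce/product/_search
{
  "query": {
    "range": {
      "created": {
        "gte": "2016-01-01",
        "lt": "2017-01-01",
        "format": "yyyy-MM-dd"
      }
    }
  }
}
```

The format parameter tells Elasticsearch how to parse the two date strings. If it is left out, the format from the field’s mapping is used instead.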

There are a number of queries that are less commonly used than the ones I just showed you, so I am only going to discuss them briefly. If you would like more detail, then please refer to the documentation.

The first one I’ll discuss is the prefix query, which matches fields that contain terms beginning with a given prefix. For example, a prefix search for a value such as “pas” on the name field would find a document with the name of “pasta”.
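
A prefix query has the same shape as a term query. Here is a minimal sketch, assuming a name field that contains the term “pasta” and using “pas” purely as an illustrative prefix:

```console
GET /ecommerce/product/_search
{
  "query": {
    "prefix": {
      "name": "pas"
    }
  }
}
```

As with the term query, the prefix is not analyzed, so it should be written the way the terms appear in the index (lowercase, in the case of the default analyzer).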

Another term level query is the wildcard query, which enables you to use wildcards within search queries. An asterisk (*) matches any character sequence, including the empty one. A question mark (?) matches any single character. Note that this query can be quite slow because it may need to iterate over many terms. To prevent extremely slow queries, you should avoid placing a wildcard at the beginning of a term.
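
As an illustration, the following sketch combines both wildcards on the name field. The pattern is only an example. It should match terms such as “pasta” or “pesto”, since the question mark stands for exactly one character and the asterisk for whatever follows “st”.

```console
GET /ecommerce/product/_search
{
  "query": {
    "wildcard": {
      "name": "p?st*"
    }
  }
}
```

The pattern starts with a literal character rather than a wildcard, which keeps the number of terms that have to be examined down.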

Similar to the wildcard query is the regexp query, which matches terms based on regular expressions. This gives a lot of flexibility when searching for documents, but again, you should be careful in terms of performance, which depends heavily on the regular expression. The best practice is to begin the regular expression with as long a literal prefix as possible. Again, remember that this query also searches the analyzed field values, which may be different from the values you added to the document.
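
A regexp query could look like the sketch below. The pattern is an assumption chosen for illustration. It begins with the literal prefix “past”, in line with the advice above, and should match terms such as “pasta” or “pastry”.

```console
GET /ecommerce/product/_search
{
  "query": {
    "regexp": {
      "name": "past[a-z]+"
    }
  }
}
```

Keep in mind that the expression has to match an entire term in the index, not just a part of it.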

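There are two queries left that I just want to briefly mention. The exists query matches documents where a given field contains at least one non-null value, whereas the missing query matches documents where a given field is either missing or contains only null values. Note that the missing query was deprecated in Elasticsearch 2.2 and removed in 5.0. On newer versions, the same result is achieved by placing an exists query in the must_not clause of a bool query.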
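
Both cases can be sketched as follows, assuming a hypothetical description field on the products. The first request finds products that have a description. The second finds products without one by negating an exists query inside a bool query, a form that should work whether or not the missing query is available in your version.

```console
# Products where the description field has at least one non-null value.
GET /ecommerce/product/_search
{
  "query": {
    "exists": {
      "field": "description"
    }
  }
}

# Products where the description field is missing or only contains null values.
GET /ecommerce/product/_search
{
  "query": {
    "bool": {
      "must_not": {
        "exists": {
          "field": "description"
        }
      }
    }
  }
}
```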

Those were the most important things about term level queries.

Bo Andersen

