Introduction to Elasticsearch Query DSL and Aggregations

In the previous blog, titled “Introduction to Elasticsearch: A Simple Guide to Key Terminologies and CRUD Operations”, we covered creating an index with a mapping, performing CRUD operations, and the key terminology used in Elasticsearch. These foundational concepts give readers a solid understanding of how Elasticsearch works and how to interact with it effectively.

Building on this knowledge, this blog delves into the advanced search features provided by Elasticsearch. We will explore techniques for improving search accuracy, such as full-text search, data summarization and analysis with aggregations, geo-search for location-based applications, multi-field and nested queries for handling complex data structures, and performance optimization strategies.

Before we start, note that all examples use an index named employees with the following mapping. Every query and aggregation below is written against these fields.

{
  "mappings": {
    "properties": {
      "name": {
        "type": "text"
      },
      "department": {
        "type": "keyword"
      },
      "age": {
        "type": "integer"
      },
      "phone": {
        "type": "nested",
        "properties": {
          "number": {
            "type": "text"
          }
        }
      },
      "joining_date": {
        "type": "date"
      }
    }
  }
}
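
To make the later examples easier to follow, here is a minimal Python sketch of what documents in this index might look like. The sample employees and the conforms helper are illustrative assumptions, not part of the original setup, and the checks are stricter than Elasticsearch itself, which can coerce some values (for example, a numeric string sent to an integer field).

```python
from datetime import date

# Hypothetical sample documents shaped to fit the employees mapping.
EMPLOYEES = [
    {
        "name": "John Wick",
        "department": "Security",
        "age": 42,
        "phone": [{"number": "1234567890"}],
        "joining_date": "2021-03-15",
    },
    {
        "name": "Jane Doe",
        "department": "IT",
        "age": 29,
        "phone": [{"number": "5550001111"}, {"number": "5550002222"}],
        "joining_date": "2022-11-01",
    },
]


def is_iso_date(value):
    """Return True if value is an ISO 8601 date string that Python can parse."""
    try:
        date.fromisoformat(value)
        return True
    except (TypeError, ValueError):
        return False


def is_phone_list(value):
    """A nested field holds a list of objects; here each needs a string number."""
    return isinstance(value, list) and all(
        isinstance(p, dict) and isinstance(p.get("number"), str) for p in value
    )


# Python-side checks that loosely mirror the mapped field types (an assumption).
FIELD_CHECKS = {
    "name": lambda v: isinstance(v, str),
    "department": lambda v: isinstance(v, str),
    "age": lambda v: isinstance(v, int) and not isinstance(v, bool),
    "phone": is_phone_list,
    "joining_date": is_iso_date,
}


def conforms(doc):
    """Check every mapped field that is present in the document."""
    return all(check(doc[f]) for f, check in FIELD_CHECKS.items() if f in doc)
```

Note that a nested field such as phone is sent as a list of objects, even when an employee has only one phone number.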

Query DSL

Query DSL (Domain Specific Language) in Elasticsearch is a powerful tool for constructing complex search queries to retrieve specific documents from an index. It offers a wide range of query types and parameters to customize search behavior according to your requirements. Let’s explore some common query types.

1. Match Query

The match query is used to perform a full-text search on fields. It analyzes the input text and retrieves documents containing any of the specified terms.

Example:

{
  "query": {
    "match": {
      "name": "John Wick"
    }
  }
}

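This query searches for documents where the “name” field contains the term “john” or the term “wick”. Because the default operator is OR, either term is enough for a document to match; documents containing both terms simply score higher. To require both terms, set the match query’s operator parameter to “and”.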
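
The difference between matching any term and matching all terms can be sketched in plain Python. This is only a rough model: the analyze function below is a hypothetical stand-in for the standard analyzer (lowercasing and splitting on non-alphanumeric characters), and the match function imitates the match query's operator parameter without computing any relevance score.

```python
import re


def analyze(text):
    """Rough stand-in for the standard analyzer: lowercase, split on non-alphanumerics."""
    return [token for token in re.split(r"[^0-9a-z]+", text.lower()) if token]


def match(field_value, query_text, operator="or"):
    """Imitate a match query: analyze both sides, then compare terms."""
    doc_terms = set(analyze(field_value))
    found = [term in doc_terms for term in analyze(query_text)]
    return all(found) if operator == "and" else any(found)


names = ["John Wick", "John Smith", "Jane Doe"]

# Default OR behavior: one shared term is enough.
any_term = [n for n in names if match(n, "John Wick")]

# With operator "and": every query term must be present.
all_terms = [n for n in names if match(n, "John Wick", operator="and")]
```

Here any_term contains both “John Wick” and “John Smith”, while all_terms contains only “John Wick”.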

2. Term Query

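The term query matches exact terms without analyzing the search input. It is best suited to keyword fields and other exact values such as numbers or dates. On a text field it often fails to match, because the indexed terms have been analyzed (for example, lowercased) while the search term has not.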

Example:

{
  "query": {
    "term": {
      "department": "IT"
    }
  }
}

This query searches for documents where the “department” field exactly matches “IT”.
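
Why the term query fits keyword fields better than text fields can be illustrated with a small Python sketch. The helpers below are hypothetical models, assuming a keyword field stores the value unchanged and a text field stores lowercased tokens, as the standard analyzer would produce.

```python
import re


def analyze(text):
    """Rough stand-in for the standard analyzer used on text fields."""
    return [token for token in re.split(r"[^0-9a-z]+", text.lower()) if token]


def term_on_keyword(stored_value, term):
    """A keyword field is indexed as-is, so the term must equal the whole value."""
    return stored_value == term


def term_on_text(stored_value, term):
    """A text field is indexed as analyzed tokens; the term itself is not analyzed."""
    return term in analyze(stored_value)


keyword_hit = term_on_keyword("IT", "IT")        # exact value matches
keyword_miss = term_on_keyword("IT", "it")       # comparison is case-sensitive
text_miss = term_on_text("John Wick", "John")    # indexed token is "john", not "John"
text_hit = term_on_text("John Wick", "john")     # matches the lowercased token
```

In practice this means a term query for “John” on the name field would likely return nothing, while a match query for the same input would succeed.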

3. Range Query

The range query is used to search for documents within a specified range of values.

Example:

{
  "query": {
    "range": {
      "age": {
        "gte": 20,
        "lte": 35
      }
    }
  }
}

This query searches for documents where the “age” field is between 20 and 35 (inclusive). Use “gt” and “lt” instead of “gte” and “lte” to exclude the boundary values.
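
The four range operators can be modeled in a few lines of Python. The in_range helper is a hypothetical illustration of how gte, gt, lte, and lt treat the boundaries; it is not how Elasticsearch evaluates ranges internally.

```python
def in_range(value, gte=None, gt=None, lte=None, lt=None):
    """Return True if value satisfies every bound that was supplied."""
    if value is None:
        return False
    if gte is not None and value < gte:
        return False
    if gt is not None and value <= gt:
        return False
    if lte is not None and value > lte:
        return False
    if lt is not None and value >= lt:
        return False
    return True


ages = [19, 20, 28, 35, 36]

# Inclusive bounds, as in the example query: 20 and 35 both match.
inclusive = [a for a in ages if in_range(a, gte=20, lte=35)]

# Exclusive bounds: the boundary values drop out.
exclusive = [a for a in ages if in_range(a, gt=20, lt=35)]

# ISO dates in yyyy-MM-dd form compare correctly as strings in this sketch.
joined_in_2022 = in_range("2022-11-01", gte="2022-01-01", lt="2023-01-01")
```

The same operators apply to date fields such as joining_date, which is why the last line uses date strings.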

4. Bool Query

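The bool query combines multiple queries using boolean clauses: must (required, contributes to the score), filter (required, does not affect the score), should (optional), and must_not (excluded).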

Example:

{
  "query": {
    "bool": {
      "must": {
        "match": {
          "name": "John Wick"
        }
      },
      "must_not": {
        "term": {
          "department": "IT"
        }
      }
    }
  }
}

This query returns documents whose “name” field matches “John” or “Wick” and excludes any document whose “department” field is “IT”.
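
How the boolean clauses combine can be sketched with plain Python predicates. The bool_query function and the two clause helpers are hypothetical; the sketch assumes the default behavior in which should clauses are optional unless the query has no must or filter clauses, and it ignores scoring entirely.

```python
def bool_query(must=(), filter_=(), should=(), must_not=()):
    """Build a predicate that combines clause predicates like a bool query."""
    def matches(doc):
        # Every must and filter clause has to match.
        if not all(clause(doc) for clause in (*must, *filter_)):
            return False
        # No must_not clause may match.
        if any(clause(doc) for clause in must_not):
            return False
        # Without must or filter clauses, at least one should clause is required.
        if should and not must and not filter_:
            return any(clause(doc) for clause in should)
        return True
    return matches


def name_matches(doc):
    """Stand-in for match on name with the terms john OR wick."""
    return bool({"john", "wick"} & set(doc["name"].lower().split()))


def in_it(doc):
    """Stand-in for term on department with the value IT."""
    return doc["department"] == "IT"


docs = [
    {"name": "John Wick", "department": "Security"},
    {"name": "John Smith", "department": "IT"},
    {"name": "Jane Doe", "department": "HR"},
]

query = bool_query(must=[name_matches], must_not=[in_it])
results = [d["name"] for d in docs if query(d)]
```

Only “John Wick” survives: “John Smith” matches the name clause but is excluded by must_not, and “Jane Doe” never matches the must clause.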

5. Nested Query

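The nested query searches fields mapped as nested. Each object in a nested array is indexed as its own hidden document, so the inner query is evaluated against one object at a time, and the parent document is returned if any of its nested objects match.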

Example:

{
  "query": {
    "nested": {
      "path": "phone",
      "query": {
        "match": {
          "phone.number": "1234567890"
        }
      }
    }
  }
}

This query returns documents that have at least one nested object under “phone” whose “number” field matches “1234567890”.
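
The value of the nested type becomes clearer when each object has more than one field. The sketch below assumes a hypothetical type subfield on phone, which is not in the mapping above, and contrasts object-style matching (fields checked independently across the array) with nested-style matching (all conditions on the same object).

```python
# Hypothetical document: the "type" subfield is an assumption for illustration.
doc = {
    "name": "Jane Doe",
    "phone": [
        {"type": "home", "number": "1111111111"},
        {"type": "work", "number": "2222222222"},
    ],
}


def flattened_match(doc, wanted_type, wanted_number):
    """Object-style matching: each field is checked across all phones independently."""
    types = {p["type"] for p in doc["phone"]}
    numbers = {p["number"] for p in doc["phone"]}
    return wanted_type in types and wanted_number in numbers


def nested_match(doc, wanted_type, wanted_number):
    """Nested-style matching: both conditions must hold on the same phone object."""
    return any(
        p["type"] == wanted_type and p["number"] == wanted_number
        for p in doc["phone"]
    )


# Jane has no home phone with this number, yet flattened matching says she does.
false_positive = flattened_match(doc, "home", "2222222222")
correct_miss = nested_match(doc, "home", "2222222222")
correct_hit = nested_match(doc, "work", "2222222222")
```

The flattened version reports a match that does not exist on any single phone entry, which is the kind of cross-object match the nested type is designed to prevent.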

These are just a few examples of the Query DSL in Elasticsearch. By combining and customizing these query types, you can construct powerful and precise search queries to suit your application’s needs.

Aggregations in Elasticsearch

Aggregations in Elasticsearch are a powerful tool for analyzing and summarizing the data matched by a query. They let you compute counts, statistics, and other summaries over the search results to gain valuable insights. Here’s an explanation of common aggregation types along with examples:

1. Terms Aggregation

The terms aggregation groups documents by the values of a specified field and provides a count for each unique value. It’s similar to the SQL “GROUP BY” clause.

Example:

{
  "aggs": {
    "top_departments": {
      "terms": {
        "field": "department",
        "size": 5
      }
    }
  }
}

Explanation:

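  • In this example, the terms aggregation groups documents by the values of the “department” field. Because “department” is mapped as a keyword in our index, it can be aggregated directly; a “.keyword” suffix is only needed when the field is a text field with a keyword sub-field.
  • It then returns the top 5 departments based on the count of documents in each department.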
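
The grouping performed by a terms aggregation can be approximated locally with a counter. This is a sketch under assumptions: terms_agg is a hypothetical helper, the sample documents are invented, and the ordering (highest count first, ties broken by key) reflects the default ordering as we understand it.

```python
from collections import Counter


def terms_agg(docs, field, size=10):
    """Count documents per distinct field value and keep the top `size` buckets."""
    counts = Counter(d[field] for d in docs if field in d)
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [{"key": key, "doc_count": count} for key, count in ordered[:size]]


docs = [
    {"department": "IT"},
    {"department": "HR"},
    {"department": "IT"},
    {"department": "Sales"},
    {"department": "IT"},
    {"department": "HR"},
    {"name": "no department set"},  # documents without the field are skipped
]

top_two = terms_agg(docs, "department", size=2)
```

With this data, top_two holds the IT bucket with 3 documents followed by the HR bucket with 2; Sales is cut off by the size limit.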

2. Date Histogram Aggregation

The date histogram aggregation groups documents into time intervals, such as days, weeks, or months, and provides counts for each interval.

Example:

{
  "aggs": {
    "joinings_over_time": {
      "date_histogram": {
        "field": "joining_date",
        "calendar_interval": "month"
      }
    }
  }
}

Explanation:

  • This aggregation groups documents by month based on the “joining_date” field.
  • It returns the count of documents for each month, creating a histogram of joinings over time.
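
Monthly bucketing can be sketched in Python as follows. The monthly_histogram helper and its output shape are simplified assumptions (real buckets carry a timestamp key rather than a yyyy-MM label), and the sketch fills empty months between the first and last populated month, which is how the date histogram behaves by default as far as we understand it.

```python
from collections import Counter
from datetime import date


def monthly_histogram(iso_dates):
    """Count dates per calendar month, including empty months in between."""
    counts = Counter(
        (d.year, d.month) for d in map(date.fromisoformat, iso_dates)
    )
    if not counts:
        return []
    (year, month), last = min(counts), max(counts)
    buckets = []
    while (year, month) <= last:
        buckets.append({
            "month": f"{year:04d}-{month:02d}",
            "doc_count": counts.get((year, month), 0),
        })
        # Step to the next calendar month, rolling over the year after December.
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return buckets


joining_dates = ["2021-11-05", "2021-11-20", "2022-01-10"]
histogram = monthly_histogram(joining_dates)
```

The result has three buckets: two joinings in 2021-11, none in 2021-12, and one in 2022-01.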

3. Range Aggregation

The range aggregation allows you to group documents into specified value ranges and provides counts for each range.

Example:

{
  "aggs": {
    "age_ranges": {
      "range": {
        "field": "age",
        "ranges": [
          { "from": 18, "to": 25 },
          { "from": 25, "to": 35 },
          { "from": 35, "to": 58 }
        ]
      }
    }
  }
}

Explanation:

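  • In this example, the range aggregation groups documents into three age ranges.
  • It provides counts for documents falling within each range. The “from” value is inclusive and the “to” value is exclusive, so an employee aged 25 is counted only in the 25 to 35 bucket.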
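
The boundary handling of the range aggregation is easy to get wrong, so here is a small Python sketch of it. The range_agg helper and the sample ages are hypothetical; the one behavior it is meant to show is that “from” is inclusive and “to” is exclusive.

```python
def range_agg(values, ranges):
    """Count values per range, treating `from` as inclusive and `to` as exclusive."""
    buckets = []
    for r in ranges:
        low, high = r.get("from"), r.get("to")
        count = sum(
            1 for v in values
            if (low is None or v >= low) and (high is None or v < high)
        )
        buckets.append({"from": low, "to": high, "doc_count": count})
    return buckets


ages = [18, 24, 25, 34, 35, 57, 58]
ranges = [
    {"from": 18, "to": 25},
    {"from": 25, "to": 35},
    {"from": 35, "to": 58},
]

age_buckets = range_agg(ages, ranges)
```

Each bucket ends up with two employees, and the employee aged 58 falls outside every range because the upper bound of the last range is exclusive.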

4. Average Aggregation

The average aggregation calculates the average value of a specified numeric field across documents.

Example:

{
  "aggs": {
    "avg_age": {
      "avg": {
        "field": "age"
      }
    }
  }
}

Explanation:

  • This aggregation calculates the average value of the “age” field across all documents.
  • It returns the average age of the employees in the dataset.

5. Max Aggregation

The max aggregation finds the maximum value of a specified field across documents.

Example:

{
  "aggs": {
    "max_age": {
      "max": {
        "field": "age"
      }
    }
  }
}

Explanation:

  • In this example, the max aggregation finds the maximum value of the “age” field across all documents.
  • It returns the highest age among all employees in the dataset.

6. Min Aggregation

The min aggregation finds the minimum value of a specified field across documents.

Example:

{
  "aggs": {
    "min_age": {
      "min": {
        "field": "age"
      }
    }
  }
}

Explanation:

  • In this example, the min aggregation finds the minimum value of the “age” field across all documents.
  • It returns the lowest age among all employees in the dataset.
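
The three metric aggregations above (avg, max, and min) can be mirrored locally in a few lines. The metric_aggs helper and the sample documents are hypothetical; the sketch assumes that documents without the field are ignored and that the result is empty (None here) when no document has a value.

```python
def metric_aggs(docs, field):
    """Compute avg, max, and min of a numeric field, skipping missing values."""
    values = [d[field] for d in docs if d.get(field) is not None]
    if not values:
        return {"avg": None, "max": None, "min": None}
    return {
        "avg": sum(values) / len(values),
        "max": max(values),
        "min": min(values),
    }


employees = [
    {"name": "A", "age": 20},
    {"name": "B", "age": 30},
    {"name": "C", "age": 40},
    {"name": "D"},  # no age recorded, so this document is ignored
]

age_metrics = metric_aggs(employees, "age")
```

Here age_metrics reports an average of 30.0, a maximum of 40, and a minimum of 20, computed over the three employees that have an age.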

These aggregations are essential tools for performing analytics and gaining insights into your data stored in Elasticsearch. Depending on your use case, you can combine multiple aggregations to derive meaningful conclusions from your dataset.
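
One common way to combine aggregations is to nest a metric inside a bucket aggregation, for example the average age per department. The sketch below shows a possible request body as a Python dictionary, followed by a local simulation. The names by_department and avg_age, the sample data, and the avg_age_by_department helper are all illustrative assumptions.

```python
from collections import defaultdict

# Request body sketch: a terms aggregation with an avg sub-aggregation.
request_body = {
    "size": 0,  # return only aggregation results, no search hits
    "aggs": {
        "by_department": {
            "terms": {"field": "department"},
            "aggs": {
                "avg_age": {"avg": {"field": "age"}},
            },
        }
    },
}


def avg_age_by_department(docs):
    """Local simulation: group ages by department, then average each group."""
    ages = defaultdict(list)
    for doc in docs:
        ages[doc["department"]].append(doc["age"])
    return {dept: sum(values) / len(values) for dept, values in ages.items()}


employees = [
    {"department": "IT", "age": 25},
    {"department": "IT", "age": 35},
    {"department": "HR", "age": 40},
]

averages = avg_age_by_department(employees)
```

In the response to such a request, each department bucket would carry its own avg_age value alongside its document count; the local simulation yields 30.0 for IT and 40.0 for HR.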

Conclusion

In this blog, we’ve explored the advanced search features provided by Elasticsearch, building upon the foundational knowledge covered in the previous blog. Through Query DSL, we’ve learned how to construct complex search queries using various query types like match, term, range, bool, and nested queries, enabling precise retrieval of documents based on specific criteria.

Furthermore, we’ve covered Elasticsearch aggregations, powerful tools for data analysis and summarization. We’ve discussed common aggregation types such as terms, date histogram, range, average, max, and min aggregations, showing how each one derives insights from search results.

By mastering these capabilities, you can improve search accuracy and analyze your data effectively when building search applications tailored to your requirements. As Elasticsearch continues to evolve, these features remain central to getting the most out of it as a search and analytics engine.
