Introduction to Elasticsearch: A Simple Guide to Key Terminologies and CRUD Operations

In our increasingly data-driven world, the efficient storage, retrieval, and analysis of information are paramount. Enter Elasticsearch: a powerful tool designed for handling vast datasets and executing rapid searches. In this guide, we’ll cover the basic terminology, the key concepts, and how to perform CRUD operations in Elasticsearch.

What is Elasticsearch?

Elasticsearch is a powerful, open-source search and analytics engine built on top of Apache Lucene. It’s designed to handle large-scale data processing, real-time search, and analytics with ease. Whether you’re dealing with log data, documents, geospatial data, or any other form of data, Elasticsearch can help you store, search, and analyze it effectively.

Elasticsearch is built to scale horizontally: as your data grows, you can add more nodes to the cluster and spread the data and the query load across them, which helps keep performance and reliability steady as the workload increases.

Elasticsearch also provides a rich set of APIs and query languages, allowing you to perform complex searches, aggregations, and analytics on your data. Whether you need to perform full-text search, filter data based on certain criteria, or run aggregations to gain insights into your data, Elasticsearch has you covered.

In addition to its powerful search capabilities, Elasticsearch integrates seamlessly with other components of the Elastic Stack, including Logstash for data ingestion, Kibana for visualization and monitoring, and Beats for lightweight data shippers. This allows you to build end-to-end data pipelines for ingesting, processing, analyzing, and visualizing your data.


Basic Terminologies in Elasticsearch

Nodes: Nodes are the individual servers that make up an Elasticsearch cluster. By default, each node stores data, participates in cluster management, and performs indexing and search operations.

Indices: Indices are collections of documents that share similar characteristics. Each index is divided into one or more shards for scalability and performance.

Shards: Shards are smaller subsets of an index that contain a portion of the index’s data. Elasticsearch distributes these shards across nodes in the cluster, allowing for parallel processing and improved performance.

Replicas: Replicas are copies of primary shards. They provide high availability and fault tolerance, and they also improve read scalability because search queries can be distributed across multiple copies of the data.

Writing Queries Using Kibana Dashboard

Kibana, part of the Elastic Stack, provides a user interface for querying Elasticsearch data and creating visualizations. Users can construct queries using the Elasticsearch Query DSL or the Lucene query syntax directly within the Kibana interface. Kibana also offers pre-built visualizations and dashboards for monitoring and analyzing data, making it easy to derive insights from Elasticsearch data.

Let’s see an example of how to create an index and define its mapping. The requests in this guide can be run from the Console in Kibana’s Dev Tools.

PUT /index_name
{
  "mappings": {
    "properties": {
      "name": {
        "type": "text"
      },
      "phones": {
        "type": "nested",
        "properties": {
          "number": {
            "type": "keyword"
          },
          "lineType": {
            "type": "text",
            "index": false
          }
        }
      }
    }
  },
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 0
  }
}

This request creates an index in Elasticsearch. Replace index_name with the name of the index you want to create. For example, to create an index named employees, use PUT /employees. Once the request runs successfully, the index exists with the mapping and settings shown.
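If you would rather issue the same request from a script than from Kibana, the sketch below builds it with Python's standard library. It only constructs the request and does not send it. The base URL (a local, unsecured development cluster on port 9200) and the helper name build_create_index_request are assumptions for illustration.

```python
import json
from urllib.request import Request

# Assumption: a local, unsecured development cluster. Adjust for your setup.
BASE_URL = "http://localhost:9200"


def build_create_index_request(index_name: str) -> Request:
    """Build (but do not send) the PUT request that creates the index."""
    body = {
        "mappings": {
            "properties": {
                "name": {"type": "text"},
                "phones": {
                    "type": "nested",
                    "properties": {
                        "number": {"type": "keyword"},
                        "lineType": {"type": "text", "index": False},
                    },
                },
            }
        },
        "settings": {"number_of_shards": 1, "number_of_replicas": 0},
    }
    return Request(
        url=f"{BASE_URL}/{index_name}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


request = build_create_index_request("employees")
print(request.get_method(), request.full_url)
```

To actually send it, pass the object to urllib.request.urlopen once a cluster is reachable at the assumed URL.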

The right number of shards depends on factors such as the size of the data, the indexing and search throughput you need, and the number of nodes in the cluster.

The right number of replicas depends on considerations such as fault tolerance, search performance, resource utilization, and operational overhead.
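A little arithmetic helps when weighing these settings. The sketch below is only a back-of-the-envelope aid, not a sizing tool: it counts how many shard copies a given combination produces and how many nodes are needed before every copy can be allocated, relying on the fact that Elasticsearch does not place a replica on the same node as its primary. The function names are made up for this example.

```python
def total_shard_copies(number_of_shards: int, number_of_replicas: int) -> int:
    """Primaries plus all of their replica copies."""
    return number_of_shards * (1 + number_of_replicas)


def nodes_needed_for_all_copies(number_of_replicas: int) -> int:
    """Each copy of a shard needs its own node, because a replica is
    never allocated on the node that already holds the same shard."""
    return 1 + number_of_replicas


# The settings used in this guide: one primary, no replicas.
print(total_shard_copies(1, 0), nodes_needed_for_all_copies(0))

# A hypothetical larger setup: three primaries, two replicas each.
print(total_shard_copies(3, 2), nodes_needed_for_all_copies(2))
```

This is why the example index uses zero replicas: on a single-node development cluster a replica would have nowhere to go and would stay unassigned.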

Each document, i.e. each employee, has a ‘name’ field of type text. We have mapped phones as nested because an employee can have more than one phone, and each phone should be treated as its own object.

Inside phones, we have defined two properties:

  • “number”: This field holds the phone number as a keyword. Keywords are stored as exact values, which makes them suitable for filtering and sorting.
  • “lineType”: This field describes the line type of the phone number. Because “index” is set to false, it cannot be searched; it is only returned as part of the stored document.
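The difference between text and keyword comes down to what gets indexed. The sketch below is a deliberately rough stand-in, not Elasticsearch's actual analyzer: it assumes that a text field is split into lowercase word tokens, while a keyword field is kept as one unmodified term. Both function names are hypothetical.

```python
import re


def analyze_as_text(value: str) -> list[str]:
    """Rough approximation of default text analysis:
    split on non-alphanumeric characters and lowercase each token."""
    return [token.lower() for token in re.findall(r"[A-Za-z0-9]+", value)]


def analyze_as_keyword(value: str) -> list[str]:
    """A keyword field is indexed as a single, unchanged term."""
    return [value]


print(analyze_as_text("Test Employee"))
print(analyze_as_keyword("1234567890"))
```

Under this simplified model, a search for "employee" can find the name "Test Employee", whereas a phone number only matches when the whole value is identical.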

Now let’s see an example of how to add a document to Elasticsearch.

POST /index_name/_doc
{
  "name": "Test Employee",
  "phones": [
    {
      "number": "1234567890",
      "lineType": "Mobile"
    },
    {
      "number": "9876543210",
      "lineType": "Landline"
    }
  ]
}

The above describes a document with the name ‘Test Employee’ and two phone numbers, each with a line type. After running this request, the document is inserted into Elasticsearch. Every document has an _id associated with it. In this case, a random ID is generated. To set the ID ourselves, we can use the request below.

POST /index_name/_doc/EMP-01
{
  "name": "Test Employee",
  "phones": [
    {
      "number": "1234567890",
      "lineType": "Mobile"
    },
    {
      "number": "9876543210",
      "lineType": "Landline"
    }
  ]
}

Here, EMP-01 specifies the document ID.
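The two indexing variants differ only in the request path. A minimal sketch of that difference, using a hypothetical helper build_index_request that returns the method, path, and body without sending anything:

```python
from urllib.parse import quote


def build_index_request(index_name: str, document: dict, doc_id=None):
    """Return (method, path, body) for indexing one document.

    Without doc_id, Elasticsearch generates the ID itself.
    With doc_id, the ID becomes the last segment of the path.
    """
    if doc_id is None:
        return ("POST", f"/{index_name}/_doc", document)
    # Percent-encode the ID so unusual characters cannot break the path.
    return ("POST", f"/{index_name}/_doc/{quote(doc_id, safe='')}", document)


employee = {
    "name": "Test Employee",
    "phones": [{"number": "1234567890", "lineType": "Mobile"}],
}

print(build_index_request("index_name", employee)[:2])
print(build_index_request("index_name", employee, "EMP-01")[:2])
```

Encoding the ID is a defensive choice made for this sketch; plain IDs such as EMP-01 pass through unchanged.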


Now let’s see how to search for a document in Elasticsearch. To fetch a document by its ID, we can use the request below:

GET /index_name/_doc/id

Replace id with the actual ID value (for example, EMP-01) to get that document back. To search for documents by name, we can use the query below:

GET /index_name/_search
{
  "query": {
    "match": {
      "name": "Test Employee"
    }
  }
}
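A search response wraps the matching documents in a hits.hits array, with each original document under _source. The sketch below pulls the documents back out. The sample_response here is hand-written for illustration and trimmed to the relevant fields; it is not output captured from a real cluster.

```python
def extract_sources(response: dict) -> list[dict]:
    """Return each hit's _source with its _id merged in."""
    return [
        {"_id": hit["_id"], **hit["_source"]}
        for hit in response["hits"]["hits"]
    ]


# Hand-written sample, trimmed to the fields used here (illustrative only).
sample_response = {
    "hits": {
        "total": {"value": 1, "relation": "eq"},
        "hits": [
            {
                "_index": "index_name",
                "_id": "EMP-01",
                "_source": {"name": "Test Employee"},
            }
        ],
    }
}

for document in extract_sources(sample_response):
    print(document["_id"], document["name"])
```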

To search on a value inside the nested field, we can use the query below:

GET /index_name/_search
{
  "query": {
    "nested": {
      "path": "phones",
      "query": {
        "bool": {
          "must": [
            { "match": { "phones.number": "1234567890" } }
          ]
        }
      }
    }
  }
}

  • “nested” is the query that is used for searching the nested fields.
  • “path” specifies the path to the nested field, in our case it is “phones”.
  • The “bool” query is used to combine multiple queries.
  • “must” is a clause indicating that the condition must match for the document to be considered a match.
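To make the pieces above more concrete, here is a simplified model of how a nested query with must clauses decides whether a document matches: each object under the path is checked on its own, and the document matches if any single object satisfies every clause. This is an illustration in plain Python, not Elasticsearch's implementation, and it uses exact equality in place of match. The second check uses lineType purely to show the per-object behaviour; with the mapping in this guide that field is not indexed and could not be queried.

```python
def nested_match(document: dict, path: str, must: dict) -> bool:
    """True if at least one object under `path` satisfies all `must` clauses.

    Keys in `must` are full field names such as "phones.number".
    """
    prefix = path + "."
    for nested_object in document.get(path, []):
        if all(
            nested_object.get(field[len(prefix):]) == expected
            for field, expected in must.items()
        ):
            return True
    return False


employee = {
    "name": "Test Employee",
    "phones": [
        {"number": "1234567890", "lineType": "Mobile"},
        {"number": "9876543210", "lineType": "Landline"},
    ],
}

print(nested_match(employee, "phones", {"phones.number": "1234567890"}))
print(nested_match(
    employee,
    "phones",
    {"phones.number": "1234567890", "phones.lineType": "Landline"},
))
```

The second call returns False because the number and the line type belong to two different phone objects, which is the distinction the nested type preserves.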

Let’s see an example of how to update the document:

POST /index_name/_update/EMP-01
{
  "doc": {
    "name": "Updated Employee",
    "phones": [
      {
        "number": "1111111111",
        "lineType": "Mobile"
      },
      {
        "number": "1234554321",
        "lineType": "Landline"
      }
    ]
  }
}

  • /index_name/_update/EMP-01 specifies the index name (index_name), the _update endpoint, and the document ID (EMP-01) that you want to update.
  • Within the request body, the “doc” key is used to specify the fields you want to update.
  • “name”: “Updated Employee” updates the value of the “name” field to “Updated Employee”.
  • The “phones” array is replaced with the new phone number objects. Each object contains the “number” and “lineType” fields, matching the mapping defined earlier.

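The “doc” body is merged into the existing document, so fields you leave out keep their current values. To change only the name, for example, we can use the request below:

POST /index_name/_update/EMP-01
{
  "doc": {
    "name": "New Updated Employee"
  }
}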
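The merge behaviour of a partial update can be pictured with a small simulation: fields present in "doc" overwrite the stored values, objects are merged key by key, and arrays are replaced as a whole. The sketch below models that in plain Python. It is an illustration of the behaviour rather than Elasticsearch's own code, and apply_partial_update is a name invented for this example.

```python
import copy


def apply_partial_update(existing: dict, partial: dict) -> dict:
    """Merge `partial` into a copy of `existing`.

    Nested objects are merged recursively; any other value,
    including a list, replaces the stored value outright.
    """
    merged = copy.deepcopy(existing)
    for key, value in partial.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = apply_partial_update(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


stored = {
    "name": "Test Employee",
    "phones": [
        {"number": "1234567890", "lineType": "Mobile"},
        {"number": "9876543210", "lineType": "Landline"},
    ],
}

renamed = apply_partial_update(stored, {"name": "New Updated Employee"})
print(renamed["name"], len(renamed["phones"]))
```

Note that sending a "phones" array in "doc" does not append to the stored list; the whole list is swapped for the new one.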

Let’s see an example of how to delete a document by its ID:

DELETE /index_name/_doc/EMP-01

This query will delete the document with ID EMP-01.

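To delete documents by employee name, we can use the request below:

POST /index_name/_delete_by_query
{
  "query": {
    "match": {
      "name": "Test Employee"
    }
  }
}

This request deletes every document that matches the query. Because “name” is a text field, the match query is analyzed and, by default, matches documents containing any of the terms. It therefore removes not only ‘Test Employee’ but any document whose name contains “test” or “employee”, so check what the query matches before running it.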

To delete documents based on a value inside the nested field, we can use the following request:

POST /index_name/_delete_by_query
{
  "query": {
    "nested": {
      "path": "phones",
      "query": {
        "bool": {
          "must": [
            { "match": { "phones.number": "1234567890" } }
          ]
        }
      }
    }
  }
}

This request deletes every document that has a phone with the number ‘1234567890’.
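Since a delete by query removes everything the query matches, it can be worth counting the matches first. One way to do that is to send the same query to the _count endpoint before sending it to _delete_by_query. The sketch below only builds the two requests as (method, path, body) tuples; the helper name is an assumption made for this example, and nothing is sent to a cluster.

```python
def build_preview_and_delete(index_name: str, query: dict):
    """Return a count request and a delete request sharing one query body."""
    body = {"query": query}
    preview = ("GET", f"/{index_name}/_count", body)
    delete = ("POST", f"/{index_name}/_delete_by_query", body)
    return preview, delete


phone_query = {
    "nested": {
        "path": "phones",
        "query": {
            "bool": {
                "must": [{"match": {"phones.number": "1234567890"}}]
            }
        },
    }
}

preview, delete = build_preview_and_delete("index_name", phone_query)
print(preview[0], preview[1])
print(delete[0], delete[1])
```

Running the preview first and checking the returned count is a cheap safeguard against a query that matches more than intended.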


Conclusion

In this blog, we have looked at why Elasticsearch matters, the terminology it uses, and its ability to handle large volumes of data, real-time search, and integration with Kibana. Whether it’s creating indices, adding documents, searching, updating, or deleting data, Elasticsearch provides simple yet robust functionality that caters to diverse data management needs.

In the next blog, we will dive deeper into the advanced searching mechanism provided by Elasticsearch.
