Django Elasticsearch: A Comprehensive Guide

What is Elasticsearch?

Elasticsearch is a powerful open-source search and analytics engine that makes data easy to explore. It powers search for many of the world’s largest organizations, including Wikipedia, Netflix, and The Guardian.

Elasticsearch is built on top of the Apache Lucene search library, offering powerful features like distributed search, real-time analytics, and support for multiple data types.

Despite its power, Elasticsearch is easy to use. It has a simple REST API for indexing and searching data, and Kibana, its companion web interface, provides a console for managing your cluster.
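As a rough illustration of what the REST API looks like from Python, the sketch below builds the two most common requests using only the standard library: storing a document with PUT /<index>/_doc/<id> and running a match query with POST /<index>/_search. The index name cars, the sample document, and the unauthenticated http://localhost:9200 address are assumptions for this example. Recent Elasticsearch releases enable TLS and authentication by default, so a real setup may need more than this.

```python
import json
from urllib.request import Request, urlopen

# Assumption: a local node reachable without authentication or TLS.
BASE_URL = "http://localhost:9200"


def index_request(index, doc_id, document, base_url=BASE_URL):
    """Build a PUT /<index>/_doc/<id> request that stores one JSON document."""
    return Request(
        url="{}/{}/_doc/{}".format(base_url, index, doc_id),
        data=json.dumps(document).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def search_request(index, field, text, base_url=BASE_URL):
    """Build a POST /<index>/_search request that runs a full-text match query."""
    body = {"query": {"match": {field: text}}}
    return Request(
        url="{}/{}/_search".format(base_url, index),
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send a request to a running node and decode the JSON response."""
    with urlopen(request) as response:
        return json.load(response)


# Usage against a running node (not executed here):
#   send(index_request("cars", 1, {"name": "Beetle", "color": "red"}))
#   print(send(search_request("cars", "name", "beetle")))
```

Keeping request construction separate from sending makes the URL and body easy to inspect before anything touches the cluster.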

Advantages of Elasticsearch

✔️ Elasticsearch is a distributed, RESTful search and analytics engine built on top of Apache Lucene.

✔️ Elasticsearch is easy to use and scalable.

✔️ Elasticsearch is suitable for many use cases, from personal to enterprise search.

✔️ Elasticsearch has a rich set of features, including full-text search, aggregations, and geolocation.

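✔️ Elasticsearch was originally released as open source under the Apache 2.0 license. The licensing terms have changed in more recent releases, so check the license of the version you install.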
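To make the feature list above more concrete, here is a minimal sketch of a single search body that combines full-text search, a geolocation filter, and an aggregation. The field names description, location, and color are hypothetical. The sketch assumes location is mapped as geo_point and color as keyword.

```python
import json


def build_search_body(text, lat, lon, distance="10km"):
    """Combine a match query, a geo_distance filter and a terms aggregation.

    Field names (description, location, color) are hypothetical.
    """
    return {
        "query": {
            "bool": {
                # Full-text search on an analyzed text field
                "must": [{"match": {"description": text}}],
                # Geolocation: keep only documents within `distance` of a point
                "filter": [
                    {
                        "geo_distance": {
                            "distance": distance,
                            "location": {"lat": lat, "lon": lon},
                        }
                    }
                ],
            }
        },
        # Aggregation: count matching documents per color
        "aggs": {"by_color": {"terms": {"field": "color"}}},
        "size": 10,
    }


body = build_search_body("family car", lat=48.85, lon=2.35)
print(json.dumps(body, indent=2))
```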

Installing Elasticsearch

This section walks through installing Elasticsearch on your local computer. Follow the steps given below −

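🟠 Step 1 − Check the version of Java installed on your computer. Older Elasticsearch releases require Java 7 or higher, while recent releases ship with a bundled JDK, so check the requirements of the version you download. You can check your Java version by doing the following −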

In Windows Operating System (OS) (using command prompt)−

> java -version

In UNIX OS (Using Terminal) −

$ java -version

🟠 Step 2 − Depending on your operating system, download Elasticsearch from www.elastic.co as mentioned below −

➡️ For Windows OS, download the ZIP file.

➡️ For UNIX OS, download the TAR file.

➡️ For Debian OS, download the DEB file.

➡️ For Red Hat and other Linux distributions, download the RPM file.

➡️ APT and Yum utilities can also be used to install Elasticsearch in many Linux distributions.

🟠 Step 3 − The installation process for Elasticsearch is simple and is described below for different OS −

➡️ Windows OS− Unzip the zip package and Elasticsearch is installed.

➡️ UNIX OS− Extract tar file in any location and Elasticsearch is installed.

➡️ Linux OS− For the DEB and RPM packages, refer to the installation instructions on www.elastic.co.
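Once the node has been started (for the ZIP and TAR archives this is typically done by running bin/elasticsearch, or bin\elasticsearch.bat on Windows, from the extracted directory), a quick way to confirm the installation is to request the root URL, which returns basic cluster information as JSON. The sketch below assumes an unsecured node on http://localhost:9200. The sample payload is illustrative and trimmed to the two fields the code reads.

```python
import json
from urllib.request import urlopen


def parse_node_info(payload):
    """Extract (cluster_name, version_number) from the JSON returned by GET /."""
    info = json.loads(payload)
    return info["cluster_name"], info["version"]["number"]


def fetch_node_info(base_url="http://localhost:9200"):
    """Query a running node; raises URLError if nothing is listening."""
    with urlopen(base_url, timeout=5) as response:
        return parse_node_info(response.read())


# Illustrative payload with made-up values, trimmed to the fields used above.
sample = b'{"name": "node-1", "cluster_name": "elasticsearch", "version": {"number": "7.17.0"}}'
print(parse_node_info(sample))

# Against a running node (not executed here):
#   print(fetch_node_info())
```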

Django Elasticsearch DSL

Django Elasticsearch DSL is a package that allows the indexing of Django models in Elasticsearch. It is built as a thin wrapper around elasticsearch-dsl-py so you can use all the features developed by the elasticsearch-dsl-py team.

You can view the full documentation here.

Features

➡️ Based on elasticsearch-dsl-py, so you can make queries with the Search class.

➡️ Django signal receivers on save and delete for keeping Elasticsearch in sync.

➡️ Management commands for creating, deleting, rebuilding, and populating indices.

➡️ Automatic Elasticsearch mapping from Django model fields.

➡️ Complex field type support (ObjectField, NestedField).

➡️ Index fast using parallel indexing.

Requirements

  • Django >= 1.11
  • Python 2.7, 3.5, 3.6, 3.7

Install and Configure

Install Django Elasticsearch DSL:

pip install django-elasticsearch-dsl

Then add django_elasticsearch_dsl to INSTALLED_APPS in your Django settings.

You must define ELASTICSEARCH_DSL in your Django settings.

For example:

ELASTICSEARCH_DSL = {
    'default': {
        'hosts': 'localhost:9200'
    },
}
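Putting the two configuration steps together, a settings fragment might look like the sketch below. Reading the host from an environment variable is one possible convention rather than something the package requires, and ELASTICSEARCH_HOST is a hypothetical variable name chosen for this example.

```python
# settings.py (fragment)
import os

INSTALLED_APPS = [
    "django.contrib.contenttypes",
    "django.contrib.auth",
    # ... your other apps ...
    "django_elasticsearch_dsl",
]

# ELASTICSEARCH_HOST is a hypothetical variable name used for this sketch.
# It falls back to the local default when the variable is not set.
ELASTICSEARCH_DSL = {
    "default": {
        "hosts": os.environ.get("ELASTICSEARCH_HOST", "localhost:9200"),
    },
}
```

This keeps the same settings file usable for local development and for a deployment where the cluster lives on another host.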

Declare Data to Index

Then, given the following models:

# models.py

from django.db import models


class Car(models.Model):
    # max_length values other than country_code are illustrative
    name = models.CharField(max_length=255)
    color = models.CharField(max_length=255)
    manufacturer = models.ForeignKey('Manufacturer', on_delete=models.CASCADE)


class Manufacturer(models.Model):
    name = models.CharField(max_length=255)
    country_code = models.CharField(max_length=2)
    created = models.DateField()


class Ad(models.Model):
    title = models.CharField(max_length=255)
    description = models.TextField()
    created = models.DateField(auto_now_add=True)
    modified = models.DateField(auto_now=True)
    url = models.URLField()
    car = models.ForeignKey('Car', related_name='ads', on_delete=models.CASCADE)

To make these models work with Elasticsearch, create a subclass of django_elasticsearch_dsl.Document. Inside it, define an Index class that holds the Elasticsearch index name and settings, and a Django class that names the model and the fields to index. Finally, register the class with the registry.register_document decorator. The Document class must be defined in documents.py in your app directory.

# documents.py

from django_elasticsearch_dsl import Document, fields
from django_elasticsearch_dsl.registries import registry
from elasticsearch_dsl import analyzer

from .models import Car, Manufacturer, Ad

# Custom analyzer that strips HTML markup before tokenizing
html_strip = analyzer(
    'html_strip',
    tokenizer="standard",
    filter=["lowercase", "stop", "snowball"],
    char_filter=["html_strip"]
)


@registry.register_document
class CarDocument(Document):
    manufacturer = fields.ObjectField(properties={
        'name': fields.TextField(),
        'country_code': fields.TextField(),
    })
    ads = fields.NestedField(properties={
        'description': fields.TextField(analyzer=html_strip),
        'title': fields.TextField(),
        'pk': fields.IntegerField(),
    })

    class Index:
        name = 'cars'

    class Django:
        model = Car
        fields = [
            'name',
            'color',
        ]
        # Optional: ensures the Car is re-indexed when a Manufacturer or Ad is updated
        related_models = [Manufacturer, Ad]

    def get_queryset(self):
        """Not mandatory, but select_related improves performance by using one SQL request."""
        return super(CarDocument, self).get_queryset().select_related(
            'manufacturer'
        )

    def get_instances_from_related(self, related_instance):
        """If related_models is set, define how to retrieve the Car instance(s) from the related model.

        Use the related_models option with caution, because a single change
        can trigger updates to a large number of indexed items.
        """
        if isinstance(related_instance, Manufacturer):
            return related_instance.car_set.all()
        elif isinstance(related_instance, Ad):
            return related_instance.car

Populate

To create and populate the Elasticsearch index and mapping use the search_index command:

python manage.py search_index --rebuild
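The same command also covers the other index operations listed under Features (creating, deleting, and populating). As a hedged sketch, the helper below assembles the arguments for each action so the command can be called from Python code such as a deployment script. The function names are hypothetical, and the flags shown (--models, -f, --parallel) are the ones I believe the package provides. Running python manage.py search_index --help confirms what your installed version accepts.

```python
def search_index_args(action, models=None, force=False, parallel=False):
    """Build the argument list for the search_index management command.

    action is one of "create", "populate", "delete" or "rebuild".
    """
    if action not in ("create", "populate", "delete", "rebuild"):
        raise ValueError("unknown action: {}".format(action))
    args = ["--" + action]
    if models:
        # Restrict the operation to the given app or app.Model labels
        args += ["--models"] + list(models)
    if force:
        # Skip the interactive confirmation prompt
        args.append("-f")
    if parallel:
        # Ask the command to index in parallel
        args.append("--parallel")
    return args


def run_search_index(action, **options):
    """Invoke the command in-process; requires a configured Django project."""
    from django.core.management import call_command  # imported lazily
    call_command("search_index", *search_index_args(action, **options))


print(search_index_args("rebuild", force=True))
```

The rebuild action deletes and recreates the index before populating it, so the populate action alone is the gentler choice when the mapping has not changed.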

Searching

To get an elasticsearch-dsl-py Search instance, use:

s = CarDocument.search().filter("term", color="red")

# or

# CarDocument indexes name and color at the top level, so match on name
s = CarDocument.search().query("match", name="beautiful")

for hit in s:
    print(
        "Car name : {}, color {}".format(hit.name, hit.color)
    )
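Because ads is declared as a NestedField in CarDocument, text inside ad descriptions is reached through a nested query rather than a top-level field. The sketch below builds such a query body as a plain dictionary. The helper name nested_match is hypothetical, and the commented usage assumes the CarDocument class defined earlier.

```python
def nested_match(path, field, text):
    """Return a search body matching `text` in `<path>.<field>` of nested objects."""
    return {
        "query": {
            "nested": {
                "path": path,
                "query": {"match": {"{}.{}".format(path, field): text}},
            }
        }
    }


body = nested_match("ads", "description", "beautiful")
print(body)

# With the Django project loaded, the body can be applied to a Search instance:
#   s = CarDocument.search().update_from_dict(body)
#   for hit in s:
#       print(hit.name)
```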

The previous example returns results specific to elasticsearch_dsl. It is also possible to convert the Elasticsearch result into a real Django queryset. Be aware that this costs an SQL request to retrieve the model instances with the IDs returned by the Elasticsearch query.

s = CarDocument.search().filter("term", color="blue")[:30]
qs = s.to_queryset()
# qs is a regular Django queryset, and it is called with order_by to keep
# the same order as the Elasticsearch result.
for car in qs:
    print(car.name)
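The ordering step matters because a database returns rows for a set of IDs in its own order, not in relevance order. The plain-Python sketch below illustrates the idea with made-up data. It is not the library's implementation, which works at the queryset level, and the helper name reorder_by_hits is hypothetical.

```python
def reorder_by_hits(hit_ids, rows):
    """Return rows sorted to follow the order of hit_ids (the search ranking)."""
    position = {pk: index for index, pk in enumerate(hit_ids)}
    return sorted(rows, key=lambda row: position[row["pk"]])


# IDs in the order Elasticsearch ranked them (made-up values)
hit_ids = [7, 2, 5]
# Rows as a database might return them, ordered by primary key
rows = [
    {"pk": 2, "name": "Golf"},
    {"pk": 5, "name": "Polo"},
    {"pk": 7, "name": "Beetle"},
]

ordered = reorder_by_hits(hit_ids, rows)
print([row["name"] for row in ordered])
```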

Conclusion

To summarize, Elasticsearch integration with Django offers developers a powerful and scalable solution for implementing advanced search functionalities. With Elasticsearch’s robust search engine and Django’s versatile framework, developers can create intelligent and efficient search experiences.

The simplicity of Elasticsearch’s REST API and the convenience of Kibana’s web-based console make it easy to index, search, and manage data. Django Elasticsearch DSL adds automatic indexing of Django models on top of that. By combining Elasticsearch and Django, developers can deliver high-performance and user-friendly search in their applications.

Frequently Asked Questions

What is Elasticsearch?

Elasticsearch stands upon the foundation of the Apache Lucene search library, delivering a robust array of capabilities including distributed search functionality, real-time analytics prowess, and comprehensive support for diverse data types.

What are the advantages of using Elasticsearch?

Elasticsearch is a widely used open-source search and analytics engine that offers several advantages for various applications and use cases. Here are some of the key advantages of using Elasticsearch: High-Speed Searching and Indexing, Near Real-Time Data, Full-Text Search, Scalability, Horizontal and Vertical Scaling, Data Analysis and Visualization, Community and Support.

What is Django Elasticsearch DSL?

Django Elasticsearch DSL is a high-level Python library that serves as a bridge between the Django web framework and Elasticsearch. It provides a convenient and intuitive way to interact with Elasticsearch within Django applications, enabling developers to seamlessly integrate advanced search and querying capabilities.

Can you store a Django model in an Elasticsearch index?

Yes, you can store the data from a Django model in an Elasticsearch index using the Elasticsearch DSL library. Elasticsearch DSL allows you to define Elasticsearch document classes that map to your Django models, facilitating the indexing of your data into Elasticsearch indices.
