How To Improve Speed & Reliability Of A Product Using Monitoring Tools

In the competitive world of products, organizations can’t afford significant downtime or subpar performance. Performance issues can harm a brand, and even small incidents can have long-term effects. For example, one of our customers, who runs a ticketing platform, gets 100 times its normal daily traffic during the NBL finals and other major sporting events. If a performance glitch were to happen at such a time, it would not only hurt the brand in that moment but could also snowball into a long-term loss, because next time the audience may choose an alternative.

Similarly, organizations can’t take risks related to security, access control, compliance, etc. If these are not handled properly, companies can end up with hefty fines, which can sometimes even break the business. Keeping this in mind, we always advocate for proper monitoring and management of IT during and after product development. In this blog, we explain the need for monitoring and how we have leveraged Open Distro for Elasticsearch to build a recipe for modern IT infrastructure monitoring and management.

Monitoring systems are responsible for gathering data from the interfaces used by a product (hardware, networks, applications) in order to analyze their operation and performance and to detect and alert about possible errors. A good monitoring system is able to monitor devices, infrastructures, applications, services, and even business processes.

Just as one cannot cross a maze blindfolded, a company cannot navigate IT risks without a proper plan and strategy. Just as we analyze data, we analyze infrastructure using event logs, resource metrics, and APM (application performance monitoring) traces. A comprehensive strategy comprises both analysis and analytics: analysis looks at past information, while analytics is built to anticipate future trends.

So, how do we leverage monitoring tools? At Mindbowser, we use the Elastic Stack for IT infrastructure monitoring. It is a log management platform focused mainly on searching, analyzing, and visualizing logs generated by various distributed systems.

It mainly contains 4 components:

  • Elasticsearch: A full-text search and analytics engine that stores and indexes the data.
  • Logstash: Retrieves log data from different input sources, applies various transformations and enrichments, and then ships the data to the supported output targets.
  • Kibana: A visualization layer that works on top of Elasticsearch, giving users the ability to analyze and visualize the data.
  • Beats: Lightweight agents that are installed on nodes to collect different types of data and forward them into the stack.

The following figure shows how the different components of the solution work together:

Fig: Elastic Stack architecture for monitoring management

Beats agents are installed on servers to gather logs, metrics, and network packet data. The data is then shipped through Logstash servers, where it is transformed and tagged, and then sent to Elasticsearch nodes to be indexed.
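
To make the flow above more concrete, here is a minimal Python sketch of the "transform, tag, and index" step. It is only an illustration, not our actual Logstash configuration: the function names, the host name `app-01`, the `staging` tag, and the index name `app-logs-staging` are all hypothetical. The only real convention it relies on is the newline-delimited JSON layout that the Elasticsearch `_bulk` API expects (an action line followed by a document line).

```python
import json
from datetime import datetime, timezone


def enrich_event(raw_line, host, environment):
    """Turn a raw log line into a tagged, structured event (the Logstash-like step)."""
    # Hypothetical rule: derive a coarse level from the text of the line.
    level = "error" if "ERROR" in raw_line else "info"
    return {
        "@timestamp": datetime.now(timezone.utc).isoformat(),
        "message": raw_line.strip(),
        "host": host,
        "tags": [environment, level],
    }


def to_bulk_body(events, index_name):
    """Serialize events as newline-delimited JSON for the Elasticsearch _bulk API."""
    lines = []
    for event in events:
        lines.append(json.dumps({"index": {"_index": index_name}}))  # action line
        lines.append(json.dumps(event))                              # document line
    return "\n".join(lines) + "\n"  # a bulk body must end with a newline


# Made-up log lines standing in for what a Beats agent would ship.
raw_lines = ["2021-01-01 ERROR payment failed\n", "2021-01-01 INFO user login\n"]
events = [enrich_event(line, host="app-01", environment="staging") for line in raw_lines]
bulk_body = to_bulk_body(events, index_name="app-logs-staging")
print(bulk_body)
```

In a real deployment, Logstash filters do this enrichment and the Elasticsearch output plugin sends the bulk requests, so no custom code is needed. The sketch only shows what happens to an event on its way to the index.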

Open Distro for Elasticsearch is an open-source distribution of Elasticsearch and Kibana that comes with a large number of open-source plugins. These plugins add further functionality and can be enabled as needed.

Additionally, we use Wazuh for security and compliance monitoring. Wazuh is a free, open-source project forked from OSSEC, which is a host-based intrusion detection system (HIDS). It is an enterprise-ready security monitoring solution for threat detection, integrity monitoring, incident response, and compliance.

Kibana has Wazuh modules to visualize threats, compliance reports, security events, and more. The biggest advantage of our custom monitoring solution is that it is flexible and scalable, with no vendor lock-in and no license cost. The underlying tools are free, backed by large communities, and trusted by a large number of enterprise clients.

The following are the things that we help our customers with:

  • Security Analytics
  • Intrusion Detection
  • Log Data Analysis
  • File integrity Monitoring
  • Vulnerability Detection
  • Configuration Assessment
  • Incident Response
  • Regulatory Compliance
  • Cloud Security Monitoring

To deploy the monitoring setup, we have written a custom script that sets up the environment easily. The setup combines the Elastic Stack with a Wazuh server installation and includes agent installation on application servers. We also added preconfigured Kibana dashboards, alerts, users, notification integrations, etc., which make the solution effective from day one.

What Issues Do We Solve Using Monitoring Tools?

As we know, every product requires best practices in development, scalability, security, compliance, etc., throughout its life cycle. Monitoring tools help in managing IT infrastructure through audit logs and performance alerts, and they also monitor adherence to compliance standards such as GDPR, PCI, and HIPAA. We use Filebeat for log aggregation and Metricbeat for metrics aggregation.

  • Log Monitoring

System logs are used for monitoring file integrity, capturing unwanted traffic to the server, troubleshooting, etc., while application logs are maintained for compliance and debugging purposes.

Application logs from all environments are kept in one place so that they are easily available to developers and QA engineers. This way, the team does not have to give developers SSH access to the application servers, which restricts unwanted activity on the servers and also saves time.

We have created dashboards for application logs and container logs in Kibana and have also restricted access to environments based on roles.

Fig: Container logs from different environments in a single dashboard

  • Resources Monitoring

Metrics data from the system is used to visualize system resources and performance in the Kibana dashboard. We can set alerts on specific events such as high disk usage.
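
As an illustration of the kind of check that sits behind a "high disk usage" alert, here is a rough Python sketch that builds an Elasticsearch search body and reads the hosts out of a response. It is a sketch under stated assumptions, not our production alert definition: the function names are hypothetical, the field names `system.filesystem.used.pct` and `host.name` assume Metricbeat's default system module fields, and the 90% threshold and 5-minute window are arbitrary example values.

```python
import json


def build_disk_usage_query(threshold=0.9, window="5m"):
    """Build a search body matching filesystem samples above `threshold` in the last `window`."""
    return {
        "size": 0,  # only the aggregation is needed, not the raw documents
        "query": {
            "bool": {
                "filter": [
                    {"range": {"@timestamp": {"gte": f"now-{window}"}}},
                    # Assumed Metricbeat field; the value is a fraction between 0 and 1.
                    {"range": {"system.filesystem.used.pct": {"gte": threshold}}},
                ]
            }
        },
        "aggs": {"hosts": {"terms": {"field": "host.name"}}},
    }


def hosts_over_threshold(response):
    """Return the host names found in the terms aggregation of a search response."""
    buckets = response.get("aggregations", {}).get("hosts", {}).get("buckets", [])
    return [bucket["key"] for bucket in buckets if bucket["doc_count"] > 0]


# Hand-written stand-in for a real search response, for illustration only.
sample_response = {
    "aggregations": {"hosts": {"buckets": [{"key": "app-01", "doc_count": 3}]}}
}

print(json.dumps(build_disk_usage_query(), indent=2))
print(hosts_over_threshold(sample_response))
```

In practice, the alerting plugin runs a query like this on a schedule and sends a notification when the trigger condition is met. The sketch only shows the shape of the condition.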

Dashboards for various resources are very useful for making decisions about maintaining the system. They also provide a quick bird’s-eye view of the status of things, so we know where we can improve or where we need to scale resources up or down. This way, we ensure that applications are consistently accessible, performant, and secure.

Fig: Sample resources monitoring dashboard

Fig: Sample alert dashboard

  • Wazuh – The Open Source Security Platform

Wazuh gives us reports on system security events such as failed SSH attempts, brute-force attacks, and modified files, along with compliance reports, security configuration assessments, vulnerabilities, etc.
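
To give a feel for what a "failed SSH attempts" or "brute-force" detection means in practice, here is a small Python sketch that counts failed logins per source address in sshd-style log lines. It is not how Wazuh is implemented; Wazuh uses its own decoders and rules. The function name, the threshold of 3 failures, and the sample log lines (which use documentation-only IP addresses) are all assumptions made for the example.

```python
import re
from collections import Counter

# Matches the common sshd message "Failed password for [invalid user] NAME from ADDRESS".
FAILED_LOGIN = re.compile(r"Failed password for (?:invalid user )?\S+ from (\S+)")


def find_brute_force_sources(log_lines, max_failures=3):
    """Return source addresses with more than `max_failures` failed SSH logins."""
    failures = Counter()
    for line in log_lines:
        match = FAILED_LOGIN.search(line)
        if match:
            failures[match.group(1)] += 1
    return {ip: count for ip, count in failures.items() if count > max_failures}


# Made-up log lines: four failures from one address, one from another, one success.
sample_log = [
    f"Jan 10 10:00:0{i} app-01 sshd[101]: "
    "Failed password for root from 203.0.113.7 port 4000 ssh2"
    for i in range(4)
] + [
    "Jan 10 10:00:05 app-01 sshd[102]: "
    "Failed password for invalid user admin from 198.51.100.2 port 4001 ssh2",
    "Jan 10 10:00:06 app-01 sshd[103]: "
    "Accepted password for deploy from 198.51.100.2 port 4002 ssh2",
]

print(find_brute_force_sources(sample_log))
```

A real detection would also consider a time window and correlate events across hosts, which is the kind of logic a dedicated tool like Wazuh provides out of the box.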

Fig: Wazuh dashboard

Fig: Security events dashboard

Conclusion

In all, we have discussed how to create a strong and stable monitoring system. As more infrastructure and application workloads become cloud-native and are expected to be highly available and secure, more care needs to be taken to get the monitoring systems right and to be sure that you are using dependable metrics to dynamically manage your environment.

These monitoring tools look for things like outside threats, misconfigured settings, overly permissive roles and permissions, and compliance with standards like PCI, HIPAA, and GDPR. Together, these checks help secure the DevOps cycle of your product. In the end, the choice of tool depends on your needs and your infrastructure. All the tools in this article are open source and are used by various organizations for monitoring.
