Advanced Elasticsearch Configuration
Elasticsearch is a search and analytics engine that stores documents, structured or not, and indexes all of their fields in near real time. It is distributed and easy to scale, and is used mainly in business and scientific settings. It is accessible through an extensive API. With this tool, we can perform extremely fast searches that support our data discovery applications.
Its main use is the monitoring of distributed logs, forming part of the EFG stack (Elasticsearch, Fluentd and Grafana).
The main objects that can be defined and manipulated in Elasticsearch are indexes. These indexes are optimized collections of JSON documents, each of which is a collection of key-value fields that contain the data.
Within the indexes, we find the documents. Each document is a JSON object stored in an Elasticsearch index with a specific id and type, and can contain from 0 to n key-value fields.
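As a minimal sketch of the index/document relationship described above, the following Python snippet builds (but does not send) the REST request that would store one document. The index name, document id and field names are hypothetical, and the `/_doc/` path segment assumes a reasonably recent Elasticsearch version; older versions that use custom types may need a different path.

```python
import json

# Hypothetical index name and document id, used only for illustration.
INDEX = "app-logs"
DOC_ID = "1"

# A document is a set of key-value fields; values may themselves be nested.
document = {
    "timestamp": "2024-01-15T10:30:00Z",
    "level": "ERROR",
    "message": "Connection refused",
    "service": {"name": "billing", "pod": "billing-7d9f"},
}


def index_request(index, doc_id, doc):
    """Return the HTTP method, path and JSON body that would store `doc`."""
    path = f"/{index}/_doc/{doc_id}"
    return "PUT", path, json.dumps(doc)


method, path, body = index_request(INDEX, DOC_ID, document)
print(method, path)
print(body)
```

Nothing is sent over the network here; the tuple can be handed to whichever HTTP client the project already uses.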
To manage both the indexes and the documents of the Elasticsearch instance, we can use the Elastic Index Management image (API).
Elasticsearch configuration
The Elasticsearch configuration is located under the path «/usr/share/elasticsearch/config», where we will find the following files:
Configuration in OpenShift
Within OpenShift, for changes to these files to persist across future changes and/or restarts of the pod, we need to create a PVC (PersistentVolumeClaim):
elastic-config.yaml

    kind: PersistentVolumeClaim
    apiVersion: v1
    metadata:
      name: elastic-config
      namespace: <namespace>
    spec:
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 100Mi
And within the Elastic deployment, assign the following fields:
elasticsearch-deployment.yaml

    ...
    spec:
      template:
        spec:
          volumes:
            - name: elasticdb-config
              persistentVolumeClaim:
                claimName: elastic-config
          containers:
            - resources:
              volumeMounts:
                - name: elasticdb-config
                  mountPath: /usr/share/elasticsearch/config
Modifying the resources assigned to the JVM
Depending on the specific needs of the project, the load that Elasticsearch receives may be greater than what its default settings are sized for. In these cases, one possible solution is to increase the memory allocated to the Java virtual machine (JVM) on which Elasticsearch runs.
To do this, we will access the «jvm.options» file mentioned above and modify the following parameters:
- -Xms → Sets the minimum (initial) heap size of the JVM.
- -Xmx → Sets the maximum heap size of the JVM.
By default, both parameters have the value 1g.
Figure: -Xms and -Xmx values.
As a recommendation, the value of both variables should be the same and never greater than 50% of your RAM. For more information, check the official documentation.
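The recommendation above can be sketched as a small helper that produces the two lines to place in «jvm.options». This is only an illustration: the function name and the example sizes are hypothetical, and it applies just the two rules stated in the text (equal values, at most 50% of RAM).

```python
def heap_options(total_ram_mb, requested_heap_mb):
    """Return jvm.options lines with equal -Xms/-Xmx, capped at 50% of RAM."""
    if total_ram_mb < 2 or requested_heap_mb <= 0:
        raise ValueError("sizes must be positive")
    cap_mb = total_ram_mb // 2                # never more than half of the RAM
    heap_mb = min(requested_heap_mb, cap_mb)  # same value for both options
    return [f"-Xms{heap_mb}m", f"-Xmx{heap_mb}m"]


# Example: 8 GiB of RAM available, 6 GiB of heap requested.
for line in heap_options(8192, 6144):
    print(line)
```

With 8192 MB of RAM the request for 6144 MB is reduced to 4096 MB, so both options end up with the same capped value.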
Modification of the buckets
There are situations in which the requests made to Elasticsearch have to return an excessively large amount of data containing aggregations (e.g. from Grafana, we query the data of the last two weeks, a period in which a large number of documents were inserted into the Elasticsearch indexes). For these cases, the elasticsearch.yml file contains a variable that sets the maximum number of buckets allowed:
By default, its value is 10000.
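To see why a two-week query can exceed this default, the arithmetic can be sketched as follows. This assumes the query is a date histogram with a fixed interval, and that the limit in question is the `search.max_buckets` setting; the source does not name the variable, so treat that name as an assumption. The function names are hypothetical.

```python
import math
from datetime import timedelta

DEFAULT_MAX_BUCKETS = 10000  # default value stated in the text


def histogram_buckets(time_range, interval):
    """Upper bound on the buckets of one date histogram over `time_range`."""
    return math.ceil(time_range / interval)


def exceeds_limit(time_range, interval, limit=DEFAULT_MAX_BUCKETS):
    """True if the histogram would need more buckets than `limit` allows."""
    return histogram_buckets(time_range, interval) > limit


two_weeks = timedelta(weeks=2)
print(histogram_buckets(two_weeks, timedelta(minutes=1)))  # 14 * 24 * 60
print(exceeds_limit(two_weeks, timedelta(minutes=1)))
print(exceeds_limit(two_weeks, timedelta(minutes=5)))
```

At a one-minute interval, two weeks need 20160 buckets, which is above the default, while a five-minute interval needs 4032 and stays within it. Widening the interval is therefore an alternative to raising the limit.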