Advanced Elasticsearch Configuration

22/07/2022 Rodrigo Alonso Aviles

Elasticsearch is a search and analytics engine that stores documents (structured or unstructured) and indexes all of their fields in near real time. It is distributed and easy to scale, and is aimed mainly at enterprise and scientific use. It is accessible through an extensive, well-developed API. With this tool we can power extremely fast searches that support our data discovery applications.

One of its main uses is monitoring distributed logs, where it forms part of the EFG stack (Elasticsearch, Fluentd and Grafana).

The main object that can be defined and manipulated in Elasticsearch is the index. An index is an optimized collection of JSON documents, each of which is a collection of key-value fields that hold the data.

Inside the indices we find the documents. As the name suggests, these are JSON documents stored in Elasticsearch within an index, each with a specific id and type (mapping types are deprecated since Elasticsearch 7.0 and removed in 8.0). A document can contain from 0 to n key-value fields.
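As an illustration of how an index, an id and a set of key-value fields fit together, the following minimal sketch builds (without sending) the HTTP request that stores one document through the `PUT /<index>/_doc/<id>` endpoint. The base URL, the index name `app-logs` and the field names are assumptions chosen for the example, not values from this article.

```python
import json
import urllib.request


def build_index_request(base_url, index, doc_id, fields):
    """Build a PUT request that stores `fields` as document `doc_id` in `index`."""
    url = f"{base_url.rstrip('/')}/{index}/_doc/{doc_id}"
    body = json.dumps(fields).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Hypothetical instance URL, index name and fields, for illustration only.
request = build_index_request(
    "http://localhost:9200",
    "app-logs",
    "1",
    {"level": "INFO", "message": "service started"},
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Against a reachable instance, the request could be sent with `urllib.request.urlopen(request)`; authentication and TLS are left out of this sketch.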

To manage both the indices and the documents of an Elasticsearch instance, you can use the image from Gestión de Índices en Elastic (API) (Index Management in Elastic).

Elasticsearch configuration

The Elasticsearch configuration lives under the path "/usr/share/elasticsearch/config", which contains the following files:

elasticsearch.keystore
elasticsearch.yml
jvm.options
jvm.options.d
log4j2.properties
role_mapping.yml
roles.yml
users
users_roles

Configuration in OpenShift

In OpenShift, for changes to these files to persist across later changes and/or restarts of the pod, a PVC must be created:

elastic-config.yaml 
 
kind: PersistentVolumeClaim 
apiVersion: v1 
metadata: 
  name: elastic-config 
  namespace: <namespace> 
spec: 
  accessModes: 
    - ReadWriteOnce 
  resources: 
    requests: 
      storage: 100Mi

Then, in the Elasticsearch deployment, set the following fields:

elasticsearch-deployment.yaml  

...
spec:
  template:
    spec:
      volumes:
        - name: elasticdb-config
          persistentVolumeClaim:
            claimName: elastic-config
      containers:
        - resources:
          volumeMounts:
            - name: elasticdb-config
              mountPath: /usr/share/elasticsearch/config
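The nesting of these fields is easy to get wrong, so here is a minimal sketch that applies the same change to a deployment manifest loaded as a Python dictionary. The function name `add_config_volume` and the sample manifest are hypothetical, and the sketch assumes Elasticsearch is the first container in the pod.

```python
import copy
import json

CONFIG_PATH = "/usr/share/elasticsearch/config"


def add_config_volume(deployment, claim_name="elastic-config",
                      volume_name="elasticdb-config"):
    """Return a copy of the manifest with the config PVC declared and mounted."""
    patched = copy.deepcopy(deployment)
    pod_spec = patched["spec"]["template"]["spec"]

    # Declare the volume at pod level, backed by the PVC.
    pod_spec.setdefault("volumes", []).append(
        {"name": volume_name, "persistentVolumeClaim": {"claimName": claim_name}}
    )

    # Mount it in the first container (assumed to be Elasticsearch).
    container = pod_spec["containers"][0]
    container.setdefault("volumeMounts", []).append(
        {"name": volume_name, "mountPath": CONFIG_PATH}
    )
    return patched


# Hypothetical, stripped-down deployment used only to show the resulting shape.
deployment = {
    "spec": {"template": {"spec": {"containers": [{"name": "elasticsearch"}]}}}
}
patched = add_config_volume(deployment)
print(json.dumps(patched, indent=2))
```

The volume name under `volumes` and the one under `volumeMounts` must match; the claim name must match the `metadata.name` of the PVC.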

Changing the resources assigned to the JVM

Depending on the needs of the project, the load that Elasticsearch receives may be higher than what its default settings anticipate. In these cases, one possible solution is to increase the memory assigned to the Java virtual machine on which Elasticsearch runs.

To do so, open the "jvm.options" file mentioned above and change the following parameters:

/usr/share/elasticsearch/config/jvm.options

-Xms3g

-Xmx3g

-Xms → Sets the initial (minimum) heap size of the JVM.
-Xmx → Sets the maximum heap size of the JVM.

By default, both parameters are set to 1g.

-Xms and -Xmx values

As a recommendation, both settings should have the same value, and that value should be no more than 50% of the available RAM. For more information, see the official documentation.
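The recommendation can be turned into a small calculation. The sketch below derives both lines for jvm.options from the memory available to Elasticsearch; the function name `heap_options` is hypothetical, and in a container the "available RAM" is assumed to be the memory limit of the pod rather than that of the node.

```python
from typing import List


def heap_options(total_ram_mb: int, max_fraction: float = 0.5) -> List[str]:
    """Return matching -Xms/-Xmx lines sized at `max_fraction` of the RAM."""
    if total_ram_mb <= 0:
        raise ValueError("total_ram_mb must be positive")
    if not 0 < max_fraction <= 0.5:
        raise ValueError("max_fraction should not exceed 0.5")
    heap_mb = int(total_ram_mb * max_fraction)
    # Same value for both flags, as recommended above.
    return [f"-Xms{heap_mb}m", f"-Xmx{heap_mb}m"]


# Example: 6 GiB available gives the 3 GiB heap used in this article.
print("\n".join(heap_options(6144)))
```

With 6144 MB available, the result is equivalent to the `-Xms3g` and `-Xmx3g` values shown earlier, expressed in megabytes.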

Changing the bucket limit

In some situations Elasticsearch receives requests whose aggregations would return too many buckets (e.g. Grafana queries the data of the last two weeks, a period in which a large number of documents were indexed in Elasticsearch). For these cases, the elasticsearch.yml file has a setting that changes this limit:

/usr/share/elasticsearch/config/elasticsearch.yml

search.max_buckets: 20000

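By default, the value is 10000 (recent Elasticsearch releases ship with a higher default, so check the documentation for your version).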
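To judge whether a given limit is enough, it helps to estimate how many buckets a time-based query produces. The following rough sketch assumes a date histogram with one bucket per interval, optionally multiplied by a number of series (for instance, a group-by term in a Grafana panel); the function name and the `series` parameter are hypothetical, and the real count depends on the aggregation and the data.

```python
import math
from datetime import timedelta


def estimate_buckets(time_range, interval, series=1):
    """Rough bucket count: one bucket per interval, for each series."""
    if interval <= timedelta(0):
        raise ValueError("interval must be positive")
    return math.ceil(time_range / interval) * series


two_weeks = timedelta(days=14)
for label, interval in [("1m", timedelta(minutes=1)), ("5m", timedelta(minutes=5))]:
    buckets = estimate_buckets(two_weeks, interval)
    verdict = "exceeds 20000" if buckets > 20000 else "fits under 20000"
    print(label, buckets, verdict)
```

Two weeks at a one-minute interval already yields 20160 buckets, slightly above the 20000 configured above, while a five-minute interval yields 4032. Widening the interval in the query is therefore often an alternative to raising the limit.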

