Introduction to Service Mesh
In today's post we would like to introduce the concept of "Service Mesh" in Kubernetes and compare several of the open source implementations available.
According to Red Hat, a service mesh is used to control how the different parts of an application share data with one another. Unlike other systems that also manage that communication, a service mesh is an infrastructure layer built directly into the application. This layer is observable and can record whether the interactions between the different parts of an app are healthy or not. In this way, it makes it easier to optimize communication and avoid downtime as the application grows.
A service mesh can help us detect and fix problems in a production environment that features:
- An architecture with multiple microservices that communicate with each other.
- Significant communication with the outside world.
- A need for a microservice versioning strategy in order to release new features via A/B testing or friends-and-family rollouts.
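For example, that kind of A/B or canary split can be expressed declaratively in the mesh itself. A minimal sketch using an Istio `VirtualService` (the service name `reviews` and the subsets `v1`/`v2` are illustrative):

```yaml
# Send 90% of traffic to subset v1 and 10% to the candidate subset v2.
# Subsets would be defined in a matching DestinationRule (not shown).
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: reviews
spec:
  hosts:
  - reviews
  http:
  - route:
    - destination:
        host: reviews
        subset: v1
      weight: 90
    - destination:
        host: reviews
        subset: v2
      weight: 10
```

Adjusting the weights gradually (10% → 50% → 100%) is the usual way to promote a new version without a big-bang deployment.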
To do so, service meshes provide the following capabilities:
- Making communication between microservices resilient, with patterns such as circuit breaking, retries, error handling, intelligent load balancing, and fault tolerance.
- Service discovery inside a cluster or even outside it.
- Routing between different versions or implementations of a service.
- Observability: collection of logs and distributed traces, quality metrics, and monitoring.
- Securing communication between microservices by handling TLS termination.
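As an illustration of the resilience patterns above, circuit breaking can be configured per destination service. A minimal sketch using an Istio `DestinationRule` (the service name `reviews` and the thresholds are illustrative, not recommendations):

```yaml
# Limit concurrent connections and eject unhealthy endpoints that
# return repeated 5xx errors (outlier detection = circuit breaking).
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: reviews
spec:
  host: reviews
  trafficPolicy:
    connectionPool:
      tcp:
        maxConnections: 100
    outlierDetection:
      consecutive5xxErrors: 5
      interval: 30s
      baseEjectionTime: 30s
```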
All these capabilities are made possible by the injection of a sidecar (in many cases Envoy Proxy), which adds them to the existing software in a practically transparent way.
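With Istio, for example, that sidecar injection happens automatically for any namespace carrying the `istio-injection=enabled` label; other meshes use their own markers (Linkerd relies on a `linkerd.io/inject: enabled` annotation). A minimal sketch, where the namespace name is illustrative:

```yaml
# Any pod created in this namespace will get the Envoy sidecar
# injected by Istio's mutating admission webhook.
apiVersion: v1
kind: Namespace
metadata:
  name: my-app
  labels:
    istio-injection: enabled
```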
Beyond these basic characteristics, it is also important to check that the service mesh you adopt provides the following:
- Multi-protocol support: TCP, HTTP/1.1, HTTP/2, gRPC, etc.
- Direct support from your cloud/Kubernetes provider, which makes for a simpler rollout.
- Traffic control: metrics, traffic splitting, and routing control.
- Integration with Grafana/Kibana.
With that in mind, there are several open source options available. The main ones are Linkerd, Istio, Consul Connect, Kuma, and Traefik Mesh.
What are the differences between them? Let's look at the following comparison table:
| | Linkerd | Istio | Consul Connect | Kuma | Traefik Mesh |
|---|---|---|---|---|---|
| License | Apache | Apache | Apache | Apache | Apache |
| Service proxy | Linkerd Proxy | Envoy | Envoy | Envoy | Traefik |
| Log collection | No | No | No | No | No |
| Integrated Prometheus | Yes | Yes, with an extension | No | Yes | Yes |
| Integrated Grafana | Yes | Yes, with an extension | No | Yes | Yes |
| Distributed tracing | Yes, OpenTelemetry | Yes, OpenTelemetry | Datadog, Jaeger, Zipkin | Yes, OpenTelemetry | Jaeger |
| Advanced load balancing | Yes | Yes | Yes | Yes | Yes |
| Resilience patterns | Circuit breaking, retries, error injection, delay injection | Circuit breaking, retries, error injection, delay injection | Circuit breaking, retries | Circuit breaking, retries, error injection, delay injection | Circuit breaking, retries |
| TLS termination | Yes | Yes | Yes, using HashiCorp Vault | Yes | Yes |
| Protocols | HTTP/1.1, HTTP/2, TCP, gRPC | HTTP/1.1, HTTP/2, TCP, gRPC | HTTP/1.1, HTTP/2, TCP, gRPC | HTTP/1.1, HTTP/2, TCP, gRPC | HTTP/1.1, HTTP/2, TCP, gRPC |
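On the TLS front, most of these meshes go beyond termination and can enforce mutual TLS between sidecars. A minimal sketch using an Istio `PeerAuthentication` policy (the namespace name is illustrative):

```yaml
# Require mTLS for all workload-to-workload traffic in this namespace;
# plaintext connections to the sidecars will be rejected.
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: my-app
spec:
  mtls:
    mode: STRICT
```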
Based on this, we can discard the options that don't support OpenTelemetry, since it is the standard, as well as those that don't cover all of today's resilience patterns. Another important factor is the use of Envoy as the service proxy, since it is the reference implementation.
Therefore, our recommendation is to use Kuma or Istio, with Istio being the preferred option because it is integrated and supported in production environments on our reference platforms, OpenShift and GKE.
We hope this post has been of interest, and if you have any questions, please leave us a comment.