Introduction to Service Mesh
In today's post we would like to introduce the concept of "Service Mesh" in Kubernetes and compare several of the open source implementations available.
According to Red Hat, a service mesh is used to control how the different parts of an application share data with one another. Unlike other systems that also manage that communication, a service mesh is an infrastructure layer built directly into the application. This layer is observable and can record whether the interactions between the different parts of an app are healthy or not. In this way, it makes it easier to optimize communication and avoid downtime as the application grows.
A service mesh can help us detect and fix problems in a production environment that features:
- An architecture with multiple microservices that communicate with each other.
- Significant communication with the outside world.
- A need for a microservice versioning strategy in order to release new features via A/B testing or friends-and-family rollouts.
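For example, that kind of A/B or canary split can be expressed declaratively in the mesh itself. A minimal sketch using an Istio `VirtualService` (the service name `reviews` and the subsets `v1`/`v2` are illustrative):

```yaml
# Send 90% of traffic to subset v1 and 10% to the candidate subset v2.
# Subsets would be defined in a matching DestinationRule (not shown).
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: reviews
spec:
  hosts:
  - reviews
  http:
  - route:
    - destination:
        host: reviews
        subset: v1
      weight: 90
    - destination:
        host: reviews
        subset: v2
      weight: 10
```

Adjusting the weights gradually (10% → 50% → 100%) is the usual way to promote a new version without a big-bang deployment.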
To do so, service meshes provide the following capabilities:
- Making communication between microservices resilient, with patterns such as circuit breaking, retries, error handling, intelligent load balancing, and fault tolerance.
- Service discovery inside a cluster or even outside it.
- Routing between different versions or implementations of a service.
- Observability: collection of logs and distributed traces, quality metrics, and monitoring.
- Securing communication between microservices by handling TLS termination.
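As an illustration of the resilience patterns above, circuit breaking can be configured per destination service. A minimal sketch using an Istio `DestinationRule` (the service name `reviews` and the thresholds are illustrative, not recommendations):

```yaml
# Limit concurrent connections and eject unhealthy endpoints that
# return repeated 5xx errors (outlier detection = circuit breaking).
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: reviews
spec:
  host: reviews
  trafficPolicy:
    connectionPool:
      tcp:
        maxConnections: 100
    outlierDetection:
      consecutive5xxErrors: 5
      interval: 30s
      baseEjectionTime: 30s
```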
All these capabilities are made possible by the injection of a sidecar (in many cases Envoy Proxy), which adds them to the existing software in a practically transparent way.
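With Istio, for example, that sidecar injection happens automatically for any namespace carrying the `istio-injection=enabled` label; other meshes use their own markers (Linkerd relies on a `linkerd.io/inject: enabled` annotation). A minimal sketch, where the namespace name is illustrative:

```yaml
# Any pod created in this namespace will get the Envoy sidecar
# injected by Istio's mutating admission webhook.
apiVersion: v1
kind: Namespace
metadata:
  name: my-app
  labels:
    istio-injection: enabled
```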
Beyond these basic characteristics, it is also important to check that the service mesh you adopt provides the following:
- Multi-protocol support: TCP, HTTP/1.1, HTTP/2, gRPC, etc.
- Direct support from your cloud/Kubernetes provider, which makes for a simpler rollout.
- Traffic control: metrics, traffic splitting, and routing control.
- Integration with Grafana/Kibana.
With that in mind, there are several open source options available. The main ones are Linkerd, Istio, Consul Connect, Kuma, and Traefik Mesh.
What are the differences between them? Let's look at the following comparison table:
| | Linkerd | Istio | Consul Connect | Kuma | Traefik Mesh |
|---|---|---|---|---|---|
| License | Apache | Apache | Apache | Apache | Apache |
| Service proxy | Linkerd Proxy | Envoy | Envoy | Envoy | Traefik |
| Log collection | No | No | No | No | No |
| Integrated Prometheus | Yes | Yes, with an extension | No | Yes | Yes |
| Integrated Grafana | Yes | Yes, with an extension | No | Yes | Yes |
| Distributed tracing | Yes, OpenTelemetry | Yes, OpenTelemetry | Datadog, Jaeger, Zipkin | Yes, OpenTelemetry | Jaeger |
| Advanced load balancing | Yes | Yes | Yes | Yes | Yes |
| Resilience patterns | Circuit breaking, retries, error injection, delay injection | Circuit breaking, retries, error injection, delay injection | Circuit breaking, retries | Circuit breaking, retries, error injection, delay injection | Circuit breaking, retries |
| TLS termination | Yes | Yes | Yes, using HashiCorp Vault | Yes | Yes |
| Protocols | HTTP/1.1, HTTP/2, TCP, gRPC | HTTP/1.1, HTTP/2, TCP, gRPC | HTTP/1.1, HTTP/2, TCP, gRPC | HTTP/1.1, HTTP/2, TCP, gRPC | HTTP/1.1, HTTP/2, TCP, gRPC |
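On the TLS front, most of these meshes go beyond termination and can enforce mutual TLS between sidecars. A minimal sketch using an Istio `PeerAuthentication` policy (the namespace name is illustrative):

```yaml
# Require mTLS for all workload-to-workload traffic in this namespace;
# plaintext connections to the sidecars will be rejected.
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: my-app
spec:
  mtls:
    mode: STRICT
```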
Based on this, we can discard the options that don't support OpenTelemetry, since it is the standard, as well as those that don't cover all of today's resilience patterns. Another important factor is the use of Envoy as the service proxy, since it is the reference implementation.
Therefore, our recommendation is to use Kuma or Istio, with Istio being the preferred option because it is integrated and supported in production environments on our reference platforms, OpenShift and GKE.
We hope this post has been of interest, and if you have any questions, please leave us a comment.