As distributed environments become more complex, teams increasingly rely on distributed tracing tools to gain visibility into the issues that surface in their traces.
Throughout this post, we will examine some of the best open-source and other generally popular distributed tracing tools available today.
1. Jaeger
Uber Technologies created the Jaeger distributed tracing system in 2015 and later released it as a fully open-source project. Using Jaeger, you can monitor and troubleshoot distributed systems. Like Kubernetes and Prometheus, Jaeger is a graduated Cloud Native Computing Foundation (CNCF) project.
2. Zipkin
Zipkin is an open-source distributed tracing system that helps troubleshoot latency problems. In addition to collecting trace data, Zipkin can be used to look it up. Based on Google's Dapper paper, Zipkin was originally developed at Twitter in 2010 and is Java-based.
Zipkin is used by many leading companies across a variety of industries, including Postmates, Uber and TransferWise.
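To make "trace data" a little more concrete, here is a minimal sketch of two related spans laid out with the field names used by Zipkin's v2 JSON format. The helper `make_span` and the service and span names are hypothetical and exist only for this illustration; in practice a tracing library builds and sends this data for you.

```python
import json
import secrets
import time


def make_span(name, service_name, duration_us, trace_id=None, parent_id=None, tags=None):
    """Build one span as a dict following the Zipkin v2 JSON field names."""
    span = {
        # 128-bit trace ID as 32 lower-hex characters, shared by every span in a trace
        "traceId": trace_id or secrets.token_hex(16),
        # 64-bit span ID as 16 lower-hex characters, unique per span
        "id": secrets.token_hex(8),
        "name": name,
        "timestamp": int(time.time() * 1_000_000),  # start time in epoch microseconds
        "duration": duration_us,                    # elapsed time in microseconds
        "localEndpoint": {"serviceName": service_name},
        "tags": tags or {},
    }
    if parent_id is not None:
        span["parentId"] = parent_id  # links a child span to its caller
    return span


# Hypothetical request: a frontend call that triggers one database query.
root = make_span("get /checkout", "frontend", duration_us=42_000)
child = make_span(
    "select orders",
    "orders-db",
    duration_us=17_000,
    trace_id=root["traceId"],
    parent_id=root["id"],
    tags={"db.system": "postgresql"},
)

# A JSON array of spans is the shape Zipkin's v2 HTTP API accepts at /api/v2/spans.
payload = json.dumps([root, child])
```

Both spans share one `traceId`, and the child points at its caller through `parentId`. That pair of links is what lets a tracing backend rebuild the full request path and show where the latency went.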
3. Logit.io
Using Logit.io's distributed tracing solution, you will be able to trace key events and see how efficiently resources are being utilised across any application, no matter the complexity. Logit.io also allows you to visualise metrics, logs, events, and traces from your applications.
Any metrics that you collect can be used to build dashboards, reports, and alerts, all within a single platform, once you have used one of our simple data forwarders to start sending your data. Aside from trace observability, the platform is also used for log management, infrastructure monitoring and deep metrics analysis.
4. New Relic
With New Relic, you can send logs from AWS, Microsoft Azure, and other leading cloud providers. New Relic was founded in 2008, so it has extensive experience in the log management market. It also provides distributed tracing, instant observability, and synthetics monitoring.
5. Datadog
For effective parsing, archiving and monitoring, Datadog's log management solution separates log ingestion from indexing. Besides metrics management and application analysis, the solution also offers synthetics monitoring and device monitoring.
When discussing the platform's APM features, users often praise Datadog's ability to ingest data from a multitude of sources, along with the large number of data points that feed its intuitive dashboards. Datadog's distributed tracing features help users debug application performance issues in real time and better understand the impact services are having on users via error, latency, and high-value traces.
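As a rough illustration of why error and latency traces get singled out, the sketch below keeps only traces that contain an error or run longer than a threshold. This is not Datadog's retention logic; `TraceSummary`, `interesting_traces`, and the 500 ms threshold are all assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass
class TraceSummary:
    """Hypothetical per-trace summary: total duration and whether any span failed."""
    trace_id: str
    duration_ms: float
    has_error: bool


def interesting_traces(traces, latency_threshold_ms=500.0):
    """Return the traces that contain an error or exceed the latency threshold."""
    return [
        t for t in traces
        if t.has_error or t.duration_ms > latency_threshold_ms
    ]


traces = [
    TraceSummary("a1", 120.0, False),  # fast and healthy, filtered out
    TraceSummary("b2", 950.0, False),  # slow, kept
    TraceSummary("c3", 80.0, True),    # fast but failed, kept
]
kept = interesting_traces(traces)
```

Real platforms use far more elaborate rules, but the principle is the same: the traces most worth a human's attention are usually the slow ones and the failed ones.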
6. Sentry
The Sentry open-source application monitoring tool helps you identify errors and performance bottlenecks within your code. As well as monitoring separate services or applications, Sentry's distributed tracing service also enables the platform to stitch together related user instances from different sources. This provides a very convenient overview of the application state at each checkpoint a user passes through.
Using Sentry, you can also track performance issues, identify poor API calls, and pinpoint slow database queries.
7. Dynatrace
Dynatrace simplifies cloud complexity so that organisations can move toward digital transformation faster. However, getting on board with Dynatrace and its distributed tracing offering can be challenging, because the sheer amount of documentation you need to work through can make it difficult to get the most out of the platform.
It may be worth comparing Dynatrace's onboarding process with other services to see how fast you can begin monitoring your applications after registration since many of its competitors offer simpler onboarding experiences.
8. Splunk
APM and SIEM are two of the main services offered by Splunk. Its platform is well known among engineers for its ability to handle large-scale projects (for instance, managing more than 200,000 devices). Its application performance monitoring solution offers distributed tracing capabilities as standard.
A high-performance observability platform aimed at enterprise users, Splunk is the original proprietary "data to everything" platform. As well as monitoring microservices in production, Splunk can be used across test and development environments.
9. AppDynamics
The full-stack observability offered by AppDynamics enhances operational visibility, making it a suitable tool for eliminating the visibility silos that arise from microservice architectures.
In addition to the high cost associated with deployment and configuration, AppDynamics has previously received criticism for its lack of platform support options.
10. Honeycomb
Monitoring production servers and troubleshooting user experiences are just two of the tasks Honeycomb can handle. It is worth knowing that if you do go over your plan limits with Honeycomb, your data will not be automatically lost, since users are given the chance to upgrade their plan to cover the overusage (similar to Logit.io).
Honeycomb has a number of built-in distributed tracing capabilities that make it a very useful platform for developers. Honeycomb's distributed tracing features are designed specifically for the monitoring of microservices as well as making their performance bottlenecks much more transparent to anyone analysing their system.
11. Wavefront
The Wavefront monitoring and analytics platform offers 3D observability with metrics, histograms, and OpenTracing-compatible distributed tracing capabilities on a single platform. In 2017, Wavefront was acquired by VMware and now provides a high-performance observability platform that enables users to monitor, visualise, and analyse their distributed application environment.
One of the main benefits of Wavefront is its ability to handle data from any cloud infrastructure. Due to this, it is used to alert on, troubleshoot, and optimise the performance of both multi-cloud and hybrid-cloud environments.
12. Grafana Tempo
Grafana Tempo is an open-source, high-volume, minimal-dependency distributed tracing backend that is built on top of the Grafana Framework. Tempo is convenient to run, as it requires only object storage, and it is fully integrated with Prometheus and Loki. Tempo can ingest data from a number of popular open-source tracing protocols, such as Jaeger, Zipkin, and OpenTelemetry.
Another benefit of Tempo is that it can be deployed at a massive scale without requiring any Elasticsearch or Cassandra clusters (which can easily become quite tedious to maintain). Furthermore, Tempo can be used with Kafka and legacy tooling such as OpenCensus.
13. Uptrace
With Uptrace, teams can deploy code with confidence, understand complex systems, and save time and money with a maintenance-free distributed tracing solution. Uptrace uses OpenTelemetry to collect data and a ClickHouse database to store it.
With Uptrace, users can pinpoint problems in complex distributed systems and find performance bottlenecks, as well as work with many petabytes worth of data. There are plans to make Uptrace an open-source tool in its own right, as mentioned on their official GitHub repository and according to the terms of their current licensing.
14. Kamon
Kamon consists of a set of tools used for instrumenting applications running on Java Virtual Machines (JVMs). The Kamon platform allows you to use metrics, tracing, and context propagation APIs while remaining completely vendor agnostic.
The Kamon APIs are completely decoupled from the services that receive the data, so you can instrument your application once and keep reporting on that data no matter which service is connected. Kamon integrates with StatsD, Prometheus, Kamino, Datadog, Zipkin and Jaeger.
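The "instrument once, report anywhere" idea can be sketched as a small interface that the application code depends on, with interchangeable reporters plugged in behind it. Kamon itself targets the JVM and its real API looks different; every class and method name below is hypothetical and written in Python only to show the shape of the design.

```python
from typing import List, Protocol, Tuple


class Reporter(Protocol):
    """Anything that can receive a named measurement."""

    def report(self, name: str, value: float) -> None: ...


class InMemoryReporter:
    """Stores measurements in a list; stands in for a real backend."""

    def __init__(self) -> None:
        self.records: List[Tuple[str, float]] = []

    def report(self, name: str, value: float) -> None:
        self.records.append((name, value))


class PrintReporter:
    """Writes measurements to stdout; a second, unrelated backend."""

    def report(self, name: str, value: float) -> None:
        print(f"{name}={value}")


class Instrumentation:
    """The only object application code talks to."""

    def __init__(self) -> None:
        self._reporters: List[Reporter] = []

    def add_reporter(self, reporter: Reporter) -> None:
        self._reporters.append(reporter)

    def record(self, name: str, value: float) -> None:
        # Fan each measurement out to every attached reporter.
        for reporter in self._reporters:
            reporter.report(name, value)


instrumentation = Instrumentation()
memory = InMemoryReporter()
instrumentation.add_reporter(memory)
instrumentation.add_reporter(PrintReporter())

# Application code records once; both reporters receive the value.
instrumentation.record("checkout.latency_ms", 42.0)
```

Swapping or adding a backend only touches the `add_reporter` calls, so the instrumented application code never has to change.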
15. Instana
With Instana, you get a fully automated application performance management solution for cloud-native apps. Instana's AutoTrace is a distributed tracing and service discovery technology that supports multiple technologies simultaneously, including .NET, Clojure, Kotlin, PHP, Python, and Scala.