Ceph

Ship your Ceph Metrics via Telegraf to your Logit.io Stack

Follow the steps below to send your observability data to Logit.io

Metrics

Configure Telegraf to ship Ceph metrics to your Logit.io stacks

Install Integration

Please click on the Install Integration button to configure your stack for this source.

Install Telegraf

This integration allows you to configure a Telegraf agent to send your metrics to Logit.io.

Choose the installation method for your operating system:

Debian and Ubuntu users can install the latest stable version of Telegraf using the apt package manager.

The command line below will:

  • Download and install repository signing key
  • Configure the influxdata repository
  • Install telegraf
curl --silent --location -O \
https://repos.influxdata.com/influxdata-archive.key \
&& echo "943666881a1b8d9b849b74caebf02d3465d6beb716510d86a39f6c8e8dac7515  influxdata-archive.key" \
| sha256sum -c - && cat influxdata-archive.key \
| gpg --dearmor \
| sudo tee /etc/apt/trusted.gpg.d/influxdata-archive.gpg > /dev/null \
&& echo 'deb [signed-by=/etc/apt/trusted.gpg.d/influxdata-archive.gpg] https://repos.influxdata.com/debian stable main' \
| sudo tee /etc/apt/sources.list.d/influxdata.list
sudo apt-get update && sudo apt-get install telegraf

The default configuration file is location at:
/etc/telegraf/telegraf.conf

Configure Telegraf

The configuration file below is pre-configured to scrape the system metrics from your hosts, add the following code to the configuration file telegraf.conf from the previous step.

telegraf.conf
# Collects performance metrics from the MON, OSD, MDS and RGW nodes
# in a Ceph storage cluster.
[[inputs.ceph]]
  ## This is the recommended interval to poll. Too frequent and you
  ## will lose data points due to timeouts during rebalancing and recovery
  interval = '1m'
 
  ## All configuration values are optional, defaults are shown below
 
  ## location of ceph binary
  ceph_binary = "/usr/bin/ceph"
 
  ## directory in which to look for socket files
  socket_dir = "/var/run/ceph"
 
  ## prefix of MON and OSD socket files, used to determine socket type
  mon_prefix = "ceph-mon"
  osd_prefix = "ceph-osd"
  mds_prefix = "ceph-mds"
  rgw_prefix = "ceph-client"
 
  ## suffix used to identify socket files
  socket_suffix = "asok"
 
  ## Ceph user to authenticate as, ceph will search for the corresponding
  ## keyring e.g. client.admin.keyring in /etc/ceph, or the explicit path
  ## defined in the client section of ceph.conf for example:
  ##
  ##     [client.telegraf]
  ##         keyring = /etc/ceph/client.telegraf.keyring
  ##
  ## Consult the ceph documentation for more detail on keyring generation.
  ceph_user = "client.admin"
 
  ## Ceph configuration to use to locate the cluster
  ceph_config = "/etc/ceph/ceph.conf"
 
  ## Whether to gather statistics via the admin socket
  gather_admin_socket_stats = true
 
  ## Whether to gather statistics via ceph commands, requires ceph_user
  ## and ceph_config to be specified
  gather_cluster_stats = false
 
### System metrics
[[inputs.disk]]
[[inputs.net]]
[[inputs.mem]]
[[inputs.system]]
[[inputs.cpu]]
  percpu = false
  totalcpu = true
  collect_cpu_time = true
  report_active = true
 
### Output
[[outputs.http]]
  url = "https://@metricsUsername:@metricsPassword@@metrics_id-vm.logit.io:@vmAgentPort/api/v1/write"
  data_format = "prometheusremotewrite"
 
  [outputs.http.headers]
    Content-Type = "application/x-protobuf"
    Content-Encoding = "snappy"

Read more about how to configure data scraping and configuration options for Ceph (opens in a new tab)

Start Telegraf

For systemd installations use systemctl to start telegraf

sudo systemctl start telegraf

Launch Grafana to View Your Data

Launch Grafana

How to diagnose no data in Stack

If you don't see data appearing in your stack after following this integration, take a look at the troubleshooting guide for steps to diagnose and resolve the problem or contact our support team and we'll be happy to assist.

Telegraf Ceph Overview

To efficiently monitor and analyze Ceph metrics in a distributed environment, it's imperative to have a dependable and proficient metrics management solution. Telegraf, an open-source metrics collection agent, is perfectly suited for this task, capable of gathering Ceph metrics from a multitude of sources, including operational Ceph clusters, databases, and other relevant applications.

Telegraf offers an extensive assortment of input plugins, enabling users to collect metrics from various sources such as CPU usage, memory consumption, network activity, and more. For storing and analyzing these harvested metrics, organizations can make use of Prometheus, an open-source monitoring and alerting system celebrated for its flexible querying language and robust graphical data visualization capabilities.

To ship Ceph metrics from Telegraf to Prometheus, organizations need to configure Telegraf to output metrics in the Prometheus format, and then set up Prometheus to scrape these metrics from the Telegraf server. This procedure involves setting up Telegraf to collect Ceph metrics, outputting them in the Prometheus format, arranging Prometheus to retrieve these metrics from the Telegraf server, and then visually interpreting the data using Prometheus's dynamic querying and graphical visualization tools.

Once the metrics are successfully transferred into Prometheus, further analysis and visualization can be conducted using Grafana. Grafana is an open-source platform well-known for its monitoring and observability capabilities, and is fully compatible with Prometheus. It allows users to create dynamic, interactive dashboards for a deeper understanding of the metrics data, providing a comprehensive view of performance trends and potential issues.

If you need any further assistance with shipping your log data to Logit.io we're here to help you get started. Feel free to get in contact with our support team by sending us a message via live chat & we'll be happy to assist.