
Google Vertex AI Metrics via Telegraf

Ship your Google Vertex AI Metrics via Telegraf to your Logit.io Stack

Configure Telegraf to ship Google Vertex AI metrics to your Logit.io Stack.


Follow this step-by-step guide to send metrics from Google Vertex AI to Logit.io:

Step 1 - Set credentials in GCP

Google Vertex AI is a machine learning (ML) platform offered by Google Cloud that simplifies the development, deployment, and management of ML models. It provides a unified and collaborative environment for ML practitioners and data scientists to build and deploy ML solutions at scale.

  • Begin by heading over to the 'Project Selector' and select the specific project from which you wish to send metrics.
  • Progress to the 'Service Account Details' screen. Here, assign a distinct name to your service account and opt for 'Create and Continue'.
  • In the 'Grant This Service Account Access to Project' screen, assign the following roles: 'Compute Viewer', 'Monitoring Viewer', and 'Cloud Asset Viewer'.
  • Upon completion of the above, click 'Done'.
  • Now find and select your new service account in the 'Service Accounts for Project' list.
  • Move to the 'KEYS' section.
  • Navigate through Keys > Add Key > Create New Key, and specify 'JSON' as the key type.
  • Lastly, click on 'Create', and make sure to save your new key.
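
The console steps above can also be approximated from the command line. The following is a minimal sketch, assuming the gcloud CLI is installed and authenticated. PROJECT_ID, SA_NAME and KEY_FILE are hypothetical placeholder values, and the run helper is introduced here only for illustration. By default the script prints the commands instead of executing them.

```shell
#!/bin/sh
# Sketch: create a read-only service account for Telegraf with the gcloud CLI.
# PROJECT_ID, SA_NAME and KEY_FILE are hypothetical placeholders.
PROJECT_ID="${PROJECT_ID:-my-project}"
SA_NAME="${SA_NAME:-telegraf-vertex-metrics}"
KEY_FILE="${KEY_FILE:-./telegraf-gcp-key.json}"
SA_EMAIL="${SA_NAME}@${PROJECT_ID}.iam.gserviceaccount.com"

# With DRY_RUN=1 (the default) each command is printed instead of executed.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

create_telegraf_service_account() {
  # Create the service account in the selected project.
  run gcloud iam service-accounts create "$SA_NAME" --project "$PROJECT_ID"
  # Grant the three viewer roles named in the steps above.
  for role in roles/compute.viewer roles/monitoring.viewer roles/cloudasset.viewer; do
    run gcloud projects add-iam-policy-binding "$PROJECT_ID" \
      --member "serviceAccount:${SA_EMAIL}" --role "$role"
  done
  # Download a JSON key for the service account.
  run gcloud iam service-accounts keys create "$KEY_FILE" --iam-account "$SA_EMAIL"
}

create_telegraf_service_account
```

Review the printed commands first, then rerun with DRY_RUN=0 to execute them against your project.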

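Now set the GOOGLE_APPLICATION_CREDENTIALS environment variable to the path of the JSON key file you just saved.

On the machine that runs Telegraf, run:

export GOOGLE_APPLICATION_CREDENTIALS=<path-to-your-gcp-key.json>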
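
Because Google client libraries read the key from the file this variable points to, it can help to confirm that the file is readable and looks like a service account key before starting Telegraf. The helper below is a hypothetical sketch, not part of gcloud or Telegraf, and it only performs a shallow text check.

```shell
# Sketch: sanity-check a service account key file before starting Telegraf.
# check_gcp_key is a hypothetical helper, not part of gcloud or Telegraf.
check_gcp_key() {
  key_path="$1"
  if [ -z "$key_path" ]; then
    echo "GOOGLE_APPLICATION_CREDENTIALS is not set" >&2
    return 1
  fi
  if [ ! -r "$key_path" ]; then
    echo "Key file is missing or unreadable: $key_path" >&2
    return 1
  fi
  # Service account keys downloaded as JSON contain "type": "service_account".
  if ! grep -q '"type"[[:space:]]*:[[:space:]]*"service_account"' "$key_path"; then
    echo "File does not look like a service account key: $key_path" >&2
    return 1
  fi
  echo "Key file looks usable: $key_path"
}

# Usage: check_gcp_key "$GOOGLE_APPLICATION_CREDENTIALS"
```

Note that a variable exported in an interactive shell is not inherited by a Telegraf process started as a system service. On package-based Linux installs the systemd unit typically reads extra variables from /etc/default/telegraf, but check your own unit file rather than relying on that assumption.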

Step 2 - Install Telegraf

This integration allows you to configure a Telegraf agent to send your metrics, in multiple formats, to Logit.io.

Telegraf is a flexible server agent equipped with plug-in support, useful for sending metrics and events from data sources like web servers, APIs, application logs, and cloud services.

To ship your metrics to Logit.io, you will add the relevant input plug-in and the outputs.http plug-in to your Telegraf configuration file.

Choose the installation instructions for your operating system below to get started:

Windows

wget https://dl.influxdata.com/telegraf/releases/telegraf-1.19.2_windows_amd64.zip

Download and extract to: C:\Program Files\Logitio\telegraf\

Configuration file: C:\Program Files\Logitio\telegraf\

MacOS

brew install telegraf

Configuration file x86_64 Intel: /usr/local/etc/telegraf.conf

Configuration file ARM (Apple Silicon): /opt/homebrew/etc/telegraf.conf

Ubuntu/Debian

wget -q https://repos.influxdata.com/influxdata-archive_compat.key
echo '393e8779c89ac8d958f81f942f9ad7fb82a25e133faddaf92e15b16e6ac9ce4c influxdata-archive_compat.key' | sha256sum -c && cat influxdata-archive_compat.key | gpg --dearmor | sudo tee /etc/apt/trusted.gpg.d/influxdata-archive_compat.gpg > /dev/null
echo 'deb [signed-by=/etc/apt/trusted.gpg.d/influxdata-archive_compat.gpg] https://repos.influxdata.com/debian stable main' | sudo tee /etc/apt/sources.list.d/influxdata.list

sudo apt-get update
sudo apt-get install telegraf

Configuration file: /etc/telegraf/telegraf.conf

RedHat and CentOS

cat <<EOF | sudo tee /etc/yum.repos.d/influxdata.repo
[influxdata]
name = InfluxData Repository - Stable
baseurl = https://repos.influxdata.com/stable/\$basearch/main
enabled = 1
gpgcheck = 1
gpgkey = https://repos.influxdata.com/influxdata-archive_compat.key
EOF

sudo yum install telegraf

Configuration file: /etc/telegraf/telegraf.conf

SLES & openSUSE

zypper ar -f obs://devel:languages:go/ go
zypper in telegraf

Configuration file: /etc/telegraf/telegraf.conf

FreeBSD/PC-BSD

sudo pkg install telegraf

Configuration file: /usr/local/etc/telegraf.conf
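
Since the configuration file location differs between platforms, a short check can confirm which file exists on your machine. This is a minimal sketch. first_existing_file is a hypothetical helper, and the candidate list covers only the Unix-style default paths mentioned in this guide.

```shell
# Sketch: report which of the default configuration paths exists.
# first_existing_file is a hypothetical helper; it prints the first argument
# that is a regular file and returns 1 if none of them is.
first_existing_file() {
  for candidate in "$@"; do
    if [ -f "$candidate" ]; then
      echo "$candidate"
      return 0
    fi
  done
  return 1
}

TELEGRAF_CONF="$(first_existing_file \
  /etc/telegraf/telegraf.conf \
  /usr/local/etc/telegraf.conf \
  /opt/homebrew/etc/telegraf.conf)" || TELEGRAF_CONF=""

if [ -n "$TELEGRAF_CONF" ]; then
  echo "Telegraf configuration found at: $TELEGRAF_CONF"
else
  echo "No Telegraf configuration found in the default locations"
fi
```

Running telegraf --version afterwards is a quick way to confirm that the binary itself is on your PATH.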

Read more about how to configure data scraping and configuration options for Telegraf

Step 3 - Configure the Telegraf input plugin

First, set up the input plug-in so that Telegraf can collect Vertex AI metrics from the Google Cloud Monitoring API. Add the following to your configuration file:

# Gather timeseries from Google Cloud Platform v3 monitoring API
[[inputs.stackdriver]]
  ## GCP Project
  project = "<your-project-name>"

  ## Include timeseries that start with the given metric type.
  metric_type_prefix_include = [
    "aiplatform.googleapis.com",
  ]

  ## Most metrics are updated no more than once per minute; it is recommended
  ## to override the agent level interval with a value of 1m or greater.
  interval = "1m"

Read more about how to configure data scraping and configuration options for Stackdriver
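
To try the input section on its own before touching the main configuration, one option is to write it to a separate file and run Telegraf once in test mode. The sketch below assumes a hypothetical helper named write_vertex_input. The project name and file path in the example are placeholders.

```shell
# Sketch: render the stackdriver input section into a separate file so it can
# be tested on its own. write_vertex_input is a hypothetical helper.
write_vertex_input() {
  project="$1"
  out_file="$2"
  cat > "$out_file" <<EOF
[[inputs.stackdriver]]
  project = "${project}"
  metric_type_prefix_include = [
    "aiplatform.googleapis.com",
  ]
  interval = "1m"
EOF
}

# Example (project name and path are placeholders):
#   write_vertex_input "my-project" ./vertex-input.conf
#   telegraf --config ./vertex-input.conf --input-filter stackdriver --test
```

The --test flag gathers metrics once, prints them to stdout and exits without sending anything to an output, which makes it a low-risk way to check credentials and the metric prefix.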

Step 4 - Configure the output plugin

Once the input plug-in is configured, set up the output plug-in so that Telegraf sends your data to Logit.io in Prometheus remote write format. Add the following to the same configuration file:

[[outputs.http]]
  
  url = "https://<your-metrics-username>:<your-metrics-password>@<your-metrics-stack-id>-vm.logit.io:<your-metrics-port>/api/v1/write"
  data_format = "prometheusremotewrite"

  [outputs.http.headers]
    Content-Type = "application/x-protobuf"
    Content-Encoding = "snappy"
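
Both configuration snippets contain placeholders in angle brackets that must be replaced with your own values. As a rough safeguard, the sketch below lists any that remain in a file. find_placeholders is a hypothetical helper, and it only recognises placeholders written in the "<your-...>" style used in this guide.

```shell
# Sketch: list any "<your-...>" placeholders still present in a config file.
# find_placeholders is a hypothetical helper. It prints matching lines with
# line numbers and returns 0 only when the file contains no placeholders.
find_placeholders() {
  conf_file="$1"
  if [ ! -r "$conf_file" ]; then
    echo "Cannot read configuration file: $conf_file" >&2
    return 2
  fi
  if grep -n '<your-[a-z-]*>' "$conf_file"; then
    return 1
  fi
  return 0
}

# Example (path is a placeholder):
#   find_placeholders /etc/telegraf/telegraf.conf && echo "No placeholders left"
```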

Step 5 - Start Telegraf

Windows

telegraf.exe --service start

MacOS

telegraf --config telegraf.conf

Linux

sudo service telegraf start

For systemd installations:

sudo systemctl start telegraf

Step 6 - View your metrics

Data should now have been sent to your Stack.


If you don't see metrics, see 'How to diagnose no data in Stack' below for common issues.

Step 7 - How to diagnose no data in Stack

If data still does not appear in your Stack after following these steps, see the Help Centre guide on diagnosing missing data, or contact support.
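
A useful first check is Telegraf's own log output. Telegraf prefixes each log message with a level marker such as "E!" for errors and "W!" for warnings, so these lines can be filtered out of a longer log. The sketch below assumes a hypothetical helper named telegraf_problems. The journalctl example applies only to systemd hosts.

```shell
# Sketch: keep only error and warning lines from Telegraf log output.
# telegraf_problems is a hypothetical helper that reads log text on stdin.
telegraf_problems() {
  # "|| true" keeps the exit status at 0 when no problem lines are found.
  grep -E ' (E|W)! ' || true
}

# Example on a systemd host:
#   journalctl -u telegraf --no-pager --since "15 minutes ago" | telegraf_problems
```

Errors that mention the stackdriver input usually point back to Step 1 (credentials and roles) or Step 3 (project name), while errors that mention the http output point to the URL and credentials in Step 4.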

Step 8 - Telegraf Google Vertex AI Platform metrics Overview

Integrating Telegraf with Google Vertex AI allows organizations to monitor the performance of their AI models and the ML infrastructure. This includes tracking the usage metrics, operation statuses, and performance indicators of deployed models. Such metrics are crucial for ensuring the models perform as expected in production environments, providing insights into model accuracy, latency, and throughput.

However, the intricate nature and sheer volume of metrics generated by ML models and infrastructure pose a significant challenge for analysis and management. Logit.io offers a powerful solution to these challenges, providing a platform adept at handling the complex data landscape of AI and ML.

Logit.io enhances the ability to monitor and analyze ML model performance and infrastructure health, offering features like real-time alerting, comprehensive dashboards, and in-depth analytics. This enables organizations to quickly identify and address issues, optimize model performance, and make data-driven decisions to improve their AI initiatives.

For teams that use Telegraf with Google Vertex AI and need deeper analytics, Logit.io streamlines the management of ML metrics and makes model behaviour and operational efficiency easier to understand. Sending metrics from Google Vertex AI to Logit.io gives you visibility over your systems, so you can monitor model performance, track data changes, and gain insights into system behaviour.

You can also explore our Google Kubernetes Engine Logs integration to troubleshoot issues in real time and fine-tune cluster performance, or the Google Cloud GKE Metrics integration to track resource utilization and monitor cluster health. These integrations work alongside Logit.io's GCP logging service.


© 2024 Logit.io Ltd, All rights reserved.