Kubernetes has revolutionized how we deploy and manage containerized applications, but with this power comes the challenge of effective logging across distributed systems. When running hundreds or thousands of containers across multiple nodes, traditional logging approaches quickly become inadequate. This is where the ELK Stack (Elasticsearch, Logstash, Kibana) combined with Kubernetes-native logging solutions becomes essential for maintaining visibility into your containerized applications.

In this guide, we'll explore the best practices for implementing Kubernetes logging with the ELK Stack, covering everything from log collection strategies to advanced filtering and visualization techniques. Whether you're just getting started with Kubernetes logging or looking to optimize your existing setup, these practices will help you build a robust, scalable logging infrastructure that provides the insights you need to maintain healthy, performant applications.
Understanding Kubernetes Logging Challenges
Kubernetes introduces unique logging challenges that traditional monolithic applications don't face. Containers are ephemeral by nature, meaning they can be created, destroyed, and moved between nodes at any time. This transient nature makes it difficult to maintain persistent log storage and creates challenges for log aggregation across your entire cluster.
Additionally, Kubernetes applications are typically distributed across multiple pods, services, and namespaces, each generating its own logs. Without proper logging infrastructure, you'll find yourself manually SSH-ing into individual nodes to check logs, which is neither scalable nor efficient for production environments.
The key challenges include:
- Ephemeral containers: Logs are lost when containers are terminated
- Distributed architecture: Logs are scattered across multiple nodes and pods
- High volume: Microservices generate massive amounts of log data
- Complex filtering: Need to correlate logs across services and namespaces
- Performance impact: Logging shouldn't affect application performance
Setting Up ELK Stack for Kubernetes
The ELK Stack provides a powerful foundation for Kubernetes logging. Elasticsearch serves as the search and analytics engine, Logstash handles log processing and transformation, and Kibana offers visualization and dashboard capabilities. When properly configured, this stack can handle the high-volume, distributed nature of Kubernetes logging while providing powerful search and analysis capabilities.
To get started with ELK Stack in Kubernetes, you'll need to deploy the components using Helm charts or Kubernetes manifests. Here's a basic setup using Helm:
# Add the Elastic Helm repository
helm repo add elastic https://helm.elastic.co
helm repo update

# Install Elasticsearch
helm install elasticsearch elastic/elasticsearch \
  --namespace logging \
  --create-namespace \
  --set replicas=3 \
  --set minimumMasterNodes=2

# Install Kibana
helm install kibana elastic/kibana \
  --namespace logging \
  --set elasticsearchHosts=http://elasticsearch-master:9200

# Install Logstash
helm install logstash elastic/logstash \
  --namespace logging \
  --set elasticsearchHosts=http://elasticsearch-master:9200
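If you'd rather keep these settings in version control than pass --set flags, the same options can go in a values file. The sketch below is a minimal example that uses the replicas and minimumMasterNodes keys shown above; the resources key exists in the elastic/elasticsearch chart, but the figures here are placeholders to tune for your cluster.

# values-elasticsearch.yaml (hypothetical filename)
replicas: 3
minimumMasterNodes: 2

# Placeholder resource figures; size these for your workload
resources:
  requests:
    cpu: "500m"
    memory: "2Gi"
  limits:
    cpu: "1000m"
    memory: "2Gi"

You would then install with the file instead of the flags, for example: helm install elasticsearch elastic/elasticsearch --namespace logging --create-namespace -f values-elasticsearch.yaml.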
Log Collection Strategies
Effective log collection in Kubernetes requires a multi-layered approach. The most common strategies include node-level logging, sidecar containers, and centralized logging agents. Each approach has its benefits and trade-offs, and often a combination of strategies works best for complex environments.
Node-Level Logging
Node-level logging involves running a logging agent on each Kubernetes node that collects logs from all containers running on that node. This approach is efficient and provides good coverage, but it requires careful configuration to avoid performance issues.
Fluentd and Fluent Bit are popular choices for node-level logging due to their lightweight nature and Kubernetes-native design. Here's an example DaemonSet configuration for Fluent Bit:
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluent-bit
  namespace: logging
spec:
  selector:
    matchLabels:
      name: fluent-bit
  template:
    metadata:
      labels:
        name: fluent-bit
    spec:
      containers:
        - name: fluent-bit
          image: fluent/fluent-bit:latest
          volumeMounts:
            - name: varlog
              mountPath: /var/log
            - name: varlibdockercontainers
              mountPath: /var/lib/docker/containers
              readOnly: true
            - name: fluent-bit-config
              mountPath: /fluent-bit/etc/
      volumes:
        - name: varlog
          hostPath:
            path: /var/log
        - name: varlibdockercontainers
          hostPath:
            path: /var/lib/docker/containers
        - name: fluent-bit-config
          configMap:
            name: fluent-bit-config
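Note that the DaemonSet references a ConfigMap named fluent-bit-config that isn't defined above. A minimal sketch is shown below, assuming CRI-formatted log files under /var/log/containers, an image that reads the classic /fluent-bit/etc/fluent-bit.conf, and a direct output to the Elasticsearch service created by the Helm chart with security disabled for brevity. The kubernetes filter also needs a ServiceAccount with permission to read pod metadata, which the DaemonSet above doesn't include.

apiVersion: v1
kind: ConfigMap
metadata:
  name: fluent-bit-config
  namespace: logging
data:
  fluent-bit.conf: |
    [SERVICE]
        Flush         5
        Log_Level     info
        Parsers_File  parsers.conf

    [INPUT]
        Name           tail
        Path           /var/log/containers/*.log
        Tag            kube.*
        Parser         cri
        Mem_Buf_Limit  5MB

    # Enriches records with pod, namespace, and label metadata
    # (requires a ServiceAccount that can read pods)
    [FILTER]
        Name       kubernetes
        Match      kube.*
        Merge_Log  On

    [OUTPUT]
        Name             es
        Match            *
        Host             elasticsearch-master
        Port             9200
        Logstash_Format  On
        Logstash_Prefix  kubernetes-logs

  parsers.conf: |
    # Parser for CRI runtimes such as containerd and CRI-O
    [PARSER]
        Name         cri
        Format       regex
        Regex        ^(?<time>[^ ]+) (?<stream>stdout|stderr) (?<logtag>[^ ]*) (?<message>.*)$
        Time_Key     time
        Time_Format  %Y-%m-%dT%H:%M:%S.%L%z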
Sidecar Logging
Sidecar logging involves adding a logging container to each pod that handles log collection for the application container. This approach provides more granular control and can be customized per application, but it increases resource usage and complexity.
Sidecar logging is particularly useful for applications that write logs to specific files or use custom log formats. The sidecar container can handle log parsing, filtering, and forwarding while the main application focuses on business logic.
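For example, the sketch below pairs a hypothetical application container that writes to /var/log/app/app.log with a Fluent Bit sidecar tailing the same file from a shared emptyDir volume. The application image, log path, and Elasticsearch target are assumptions to adapt to your own workloads.

apiVersion: v1
kind: Pod
metadata:
  name: app-with-logging-sidecar
spec:
  containers:
    # Main application container -- assumed to write logs to /var/log/app/app.log
    - name: app
      image: my-app:1.0          # hypothetical application image
      volumeMounts:
        - name: app-logs
          mountPath: /var/log/app
    # Sidecar that tails the log file and forwards it to Elasticsearch
    - name: log-forwarder
      image: fluent/fluent-bit:latest
      args:
        - "-i"
        - "tail"
        - "-p"
        - "path=/var/log/app/app.log"
        - "-o"
        - "es"
        - "-p"
        - "host=elasticsearch-master"
        - "-p"
        - "port=9200"
      volumeMounts:
        - name: app-logs
          mountPath: /var/log/app
          readOnly: true
  volumes:
    # Shared volume so both containers see the same log files
    - name: app-logs
      emptyDir: {}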
Log Processing and Enrichment
Raw Kubernetes logs often lack the context and structure needed for effective analysis. Log processing and enrichment add metadata, parse structured data, and transform logs into a format that's easier to search and analyze.
Logstash provides powerful processing capabilities through its pipeline configuration. Here's an example Logstash configuration that processes Kubernetes logs:
input {
  beats {
    port => 5044
  }
}

filter {
  if [kubernetes] {
    mutate {
      add_field => {
        "cluster_name"   => "${CLUSTER_NAME}"
        "namespace"      => "%{[kubernetes][namespace]}"
        "pod_name"       => "%{[kubernetes][pod][name]}"
        "container_name" => "%{[kubernetes][container][name]}"
      }
    }
  }

  grok {
    match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:level} %{GREEDYDATA:message}" }
  }

  date {
    match => [ "timestamp", "ISO8601" ]
  }
}

output {
  elasticsearch {
    hosts => ["elasticsearch-master:9200"]
    index => "kubernetes-logs-%{+YYYY.MM.dd}"
  }
}
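The beats input on port 5044 assumes a Beats shipper such as Filebeat is running in the cluster, typically as a DaemonSet, and forwarding container logs to Logstash. A minimal filebeat.yml sketch is shown below; the Logstash service name is an assumption based on the Helm release above, and NODE_NAME is expected to be injected into the Filebeat pod via the downward API.

# filebeat.yml -- minimal sketch for shipping container logs to Logstash
filebeat.inputs:
  - type: container
    paths:
      - /var/log/containers/*.log

processors:
  # Enrich each event with pod, namespace, and container metadata
  - add_kubernetes_metadata:
      host: ${NODE_NAME}
      matchers:
        - logs_path:
            logs_path: "/var/log/containers/"

output.logstash:
  hosts: ["logstash-logstash:5044"]   # assumed service name from the Helm release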
Best Practices for Kubernetes Logging
Implementing effective Kubernetes logging requires following several best practices that ensure your logging infrastructure is scalable, maintainable, and provides the insights you need.
Structured Logging
Use structured logging formats like JSON instead of plain text. Structured logs are easier to parse, filter, and analyze. Most modern logging libraries support structured logging out of the box.
Example of structured logging in a Node.js application:
const winston = require('winston');

const logger = winston.createLogger({
  format: winston.format.json(),
  transports: [
    new winston.transports.Console({
      format: winston.format.combine(
        winston.format.timestamp(),
        winston.format.json()
      )
    })
  ]
});

logger.info('User login attempt', {
  userId: '12345',
  ipAddress: '192.168.1.100',
  userAgent: 'Mozilla/5.0...',
  success: true
});
Log Level Management
Implement proper log levels (DEBUG, INFO, WARN, ERROR) and configure your applications to use appropriate levels for different environments. In production, you might want to set the minimum level to INFO or WARN to reduce log volume.
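A simple way to vary the level per environment is to expose it as an environment variable in the Deployment and have the application's logger read it at startup. The excerpt below assumes a hypothetical LOG_LEVEL variable and application image; use whatever variable name your logging library or framework expects.

# Excerpt from a Deployment spec -- assumes the application reads LOG_LEVEL at startup
spec:
  template:
    spec:
      containers:
        - name: app
          image: my-app:1.0        # hypothetical application image
          env:
            - name: LOG_LEVEL
              value: "info"        # e.g. "debug" in development, "warn" in production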
Resource Management
Configure resource limits for your logging components to prevent them from consuming too much CPU or memory. This is especially important for node-level logging agents that run on every node.
resources:
  requests:
    memory: "64Mi"
    cpu: "50m"
  limits:
    memory: "128Mi"
    cpu: "100m"
Log Retention and Archival
Implement log retention policies to manage storage costs and comply with data retention requirements. Elasticsearch provides index lifecycle management (ILM) policies that can automatically manage log retention.
Monitoring and Alerting
Effective logging isn't just about collecting logs—it's about using them to monitor your applications and infrastructure. Set up dashboards in Kibana to visualize key metrics and create alerts for critical issues.
Common monitoring scenarios include:
- Error rate monitoring: Track application error rates and alert when they exceed thresholds
- Performance monitoring: Monitor response times and throughput
- Security monitoring: Detect suspicious activities and failed authentication attempts
- Resource monitoring: Track CPU, memory, and disk usage
Integration with Logit.io
While the ELK Stack provides powerful logging capabilities, managing and maintaining it can be complex and resource-intensive. Logit.io offers a managed ELK Stack solution that handles the infrastructure management while providing all the benefits of the ELK Stack.
Logit.io's platform includes:
- Managed Elasticsearch: Fully managed and optimized Elasticsearch clusters
- Advanced Log Processing: Built-in Logstash pipelines for common use cases
- Custom Dashboards: Pre-built Kibana dashboards for Kubernetes monitoring
- Security and Compliance: SOC 2, ISO 27001, GDPR, and HIPAA compliance
- Global Infrastructure: Multiple data centers for regional compliance
To get started with Logit.io for your Kubernetes logging needs, you can sign up for a free trial and begin collecting logs from your Kubernetes cluster within minutes.
Conclusion
Effective Kubernetes logging with the ELK Stack requires careful planning and implementation of best practices. By following the strategies outlined in this guide, you can build a robust logging infrastructure that provides the visibility you need to maintain healthy, performant applications.
Remember that logging is not a one-time setup—it's an ongoing process that requires monitoring, optimization, and maintenance. Start with the basics, implement structured logging, and gradually add more sophisticated features as your needs grow.
Whether you choose to manage your own ELK Stack or use a managed solution like Logit.io, the key is to ensure your logging infrastructure scales with your applications and provides the insights you need to make informed decisions about your Kubernetes environment.