[opentelemetry] add templating for ceph and kvm #557

Open · wants to merge 4 commits into base: main
14 changes: 11 additions & 3 deletions opentelemetry/README.md
@@ -55,7 +55,13 @@ The package will deploy the OpenTelemetry Operator which works as a manager for
- Journald events from systemd journal
- its own metrics

You can disable the collection of logs by setting `open_telemetry.LogCollector.enabled` to `false`. The same is true for disabling metrics: `open_telemetry.MetricsCollector.enabled` to `false`.
You can disable the collection of logs by setting `openTelemetry.logsCollector.enabled` to `false`. The same is true for disabling the collection of metrics by setting `openTelemetry.metricsCollector.enabled` to `false`.
The `logsCollector` comes with a standard set of log-processing steps, such as adding cluster information and common labels for Journald events.
In addition, we provide default pipelines for common log types. Currently, the following log types have default configurations that can be enabled (requires `logsCollector.enabled` to be set to `true`):
1. KVM: `openTelemetry.logsCollector.kvmConfig`: Logs from Kernel-based Virtual Machines (KVMs), providing insights into virtualization activities, resource usage, and system performance.
2. Ceph: `openTelemetry.logsCollector.cephConfig`: Logs from Ceph storage systems, capturing information about cluster operations, performance metrics, and health status.

These default configurations provide common labels and Grok parsing for logs emitted through the respective services.
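
For example, a minimal values sketch that switches on both default configurations might look like the following (the option names come from the table below; the nesting of the dotted keys into YAML is an assumption):

```yaml
# Minimal sketch, assuming the dotted option names from the table below
# map directly onto nested Helm values.
openTelemetry:
  logsCollector:
    enabled: true      # required before any per-service config takes effect
    kvmConfig:
      enabled: true    # enables the KVM receiver/transform pipelines
    cephConfig:
      enabled: true    # enables the Ceph RGW/OSD transform pipelines
  metricsCollector:
    enabled: true      # independent of the log collection switches
```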

Based on the backend selection, the telemetry data will be exported to the backend.

@@ -67,8 +73,10 @@ Greenhouse regularly performs integration tests that are bundled with **OpenTele

| Name | Description | Type | required |
| ------------ | -------------------- |---------------- | ------------------ |
`openTelemetry.logsCollector.enabled` | Activates the standard configuration for logs | bool | `false`
`openTelemetry.metricsCollector.enabled` | Activates the standard configuration for metrics | bool | `false`
`openTelemetry.logsCollector.enabled` | Activates the standard configuration for logs | bool | `false` |
`openTelemetry.logsCollector.kvmConfig.enabled` | Activates the configuration for KVM logs (requires logsCollector to be enabled) | bool | `false` |
`openTelemetry.logsCollector.cephConfig.enabled` | Activates the configuration for Ceph logs (requires logsCollector to be enabled) | bool | `false` |
`openTelemetry.metricsCollector.enabled` | Activates the standard configuration for metrics | bool | `false` |
`openTelemetry.openSearchLogs.username` | Username for OpenSearch endpoint | secret | `false` |
`openTelemetry.openSearchLogs.password` | Password for OpenSearch endpoint | secret | `false` |
`openTelemetry.openSearchLogs.endpoint` | Endpoint URL for OpenSearch | secret | `false` |
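
The `openSearchLogs.*` options are of type `secret`, so in Greenhouse they would normally be supplied from a Kubernetes secret via the Plugin's option values rather than as plain text. A hypothetical sketch follows; the Plugin schema, secret name, and keys shown here are assumptions, not taken from this PR:

```yaml
# Hypothetical Plugin snippet — assumes the Greenhouse Plugin API shape and a
# pre-existing secret named "opensearch-credentials"; adjust to your setup.
apiVersion: greenhouse.sap/v1alpha1
kind: Plugin
metadata:
  name: opentelemetry
spec:
  pluginDefinition: opentelemetry
  optionValues:
    - name: openTelemetry.logsCollector.enabled
      value: true
    - name: openTelemetry.openSearchLogs.endpoint
      valueFrom:
        secret:
          name: opensearch-credentials
          key: endpoint
    - name: openTelemetry.openSearchLogs.username
      valueFrom:
        secret:
          name: opensearch-credentials
          key: username
    - name: openTelemetry.openSearchLogs.password
      valueFrom:
        secret:
          name: opensearch-credentials
          key: password
```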
2 changes: 1 addition & 1 deletion opentelemetry/chart/Chart.yaml
@@ -4,7 +4,7 @@
apiVersion: v2
appVersion: v0.114.0
name: opentelemetry-operator
version: 0.6.1
version: 0.7.0
description: OpenTelemetry Operator Helm chart for Kubernetes
icon: https://raw.githubusercontent.com/cncf/artwork/a718fa97fffec1b9fd14147682e9e3ac0c8817cb/projects/opentelemetry/icon/color/opentelemetry-icon-color.png
type: application
94 changes: 94 additions & 0 deletions opentelemetry/chart/templates/_ceph-config.tpl
@@ -0,0 +1,94 @@
{{- define "ceph.labels" }}
- tag_name: app.label.component
key: app.kubernetes.io/component
from: pod
- tag_name: app.label.created-by
key: app.kubernetes.io/created-by
from: pod
- tag_name: app.label.managed-by
key: app.kubernetes.io/managed-by
from: pod
- tag_name: app.label.part-of
key: app.kubernetes.io/part-of
from: pod
- tag_name: app.label.ceph-osd-id
key: ceph-osd-id
from: pod
- tag_name: app.label.ceph_daemon_id
key: ceph_daemon_id
from: pod
- tag_name: app.label.ceph_daemon_type
key: ceph_daemon_type
from: pod
- tag_name: app.label.device-class
key: device-class
from: pod
- tag_name: app.label.failure-domain
key: failure-domain
from: pod
- tag_name: app.label.osd
key: osd
from: pod
- tag_name: app.label.osd-store
key: osd-store
from: pod
- tag_name: app.label.portable
key: portable
from: pod
- tag_name: app.label.rook_cluster
key: rook_cluster
from: pod
- tag_name: app.label.rook_io.operator-namespace
key: rook_io/operator-namespace
from: pod
- tag_name: app.label.topology-location-host
key: topology-location-host
from: pod
- tag_name: app.label.topology-location-region
key: topology-location-region
from: pod
- tag_name: app.label.topology-location-root
key: topology-location-root
from: pod
- tag_name: app.label.topology-location-zone
key: topology-location-zone
from: pod
{{- end }}

{{- define "ceph.transform" }}
transform/ceph_rgw:
error_mode: ignore
log_statements:
- context: log
conditions:
- resource.attributes["app.label.component"] == "cephobjectstores.ceph.rook.io"
statements:
- merge_maps(attributes, ExtractGrokPatterns(body, "%{WORD:debug_level} %{TIMESTAMP_ISO8601:log_timestamp}(%{SPACE})?%{NOTSPACE}(%{SPACE})?%{NOTSPACE}(%{SPACE})?%{WORD}\\:(%{SPACE})?%{WORD}\\:(%{SPACE})?%{IP:client.address}(%{SPACE})?%{NOTSPACE}(%{SPACE})%{PROJECT_ID:project.id}(\\$%{NOTSPACE})?(%{SPACE})?\\[%{HTTPDATE:request.timestamp}\\] \"%{WORD:request.method} \\/(?<bucket>[a-zA-Z0-9._+-]+)?(\\/)?(%{NOTSPACE:request.path})? %{WORD:network.protocol.name}/%{NOTSPACE:network.protocol.version}\" %{NUMBER:response} %{NUMBER:content.length:int} %{NOTSPACE} \"%{GREEDYDATA:user.agent}\" %{NOTSPACE} latency=%{NUMBER:latency:float}", true, ["PROJECT_ID=([A-Za-z0-9-]+)"]),"upsert")
- merge_maps(attributes, ExtractGrokPatterns(body, "%{WORD:log_level}%{SPACE}%{NOTSPACE}%{SPACE}%{NOTSPACE:process.name}", true),"upsert")
- set(attributes["network.protocol.name"], ConvertCase(attributes["network.protocol.name"], "lower")) where cache["network.protocol.name"] != nil
- set(attributes["config.parsed"], "ceph_rgw") where attributes["project.id"] != nil
- set(attributes["config.parsed"], "ceph_rgw") where attributes["log_level"] != nil

transform/ceph_osd:
error_mode: ignore
log_statements:
- context: log
conditions:
- resource.attributes["app.label.component"] == "cephclusters.ceph.rook.io"
statements:
- merge_maps(attributes, ExtractGrokPatterns(body, "%{WORD:osd.stats.level}%{SPACE}%{NOTSPACE:osd.stats.files}%{SPACE}%{NUMBER:osd.stats.osd.stats.size:float}%{SPACE}%{WORD}%{SPACE}%{NUMBER:osd.stats.size_unit:float}%{SPACE}%{NUMBER:osd.stats.score:float}%{SPACE}%{NUMBER:osd.stats.read_gb:float}%{SPACE}%{NUMBER:osd.stats.rn_gb:float}%{SPACE}%{NUMBER:osd.stats.rnp1_gb:float}%{SPACE}%{NUMBER:osd.stats.write_gb:float}%{SPACE}%{NUMBER:osd.stats.wnew_gb:float}%{SPACE}%{NUMBER:osd.stats.moved:float}%{SPACE}%{NUMBER:osd.stats.w_amp:float}%{SPACE}%{NUMBER:osd.stats.rd_mb_s:float}%{SPACE}%{NUMBER:osd.stats.wr_mb_s:float}%{SPACE}%{NUMBER:osd.stats.comp_sec:float}%{SPACE}%{NUMBER:osd.stats.comp_merge_cpu_sec:float}%{SPACE}%{NUMBER:osd.stats.cpmp_cnt:float}%{SPACE}%{NUMBER:osd.stats.av_sec:float}%{SPACE}%{NUMBER:osd.stats.keyin:float}%{SPACE}%{NUMBER:osd.stats.keydrop:float}%{SPACE}%{NUMBER:osd.stats.rblob_gb:float}%{SPACE}%{NUMBER:osd.stats.wblol_gb:float}", true),"upsert")
- merge_maps(attributes, ExtractGrokPatterns(body, "%{GREEDYDATA:osd.wall.type}\\:%{SPACE}%{NUMBER:osd.wall.writes:float}(K|M|G)?%{SPACE}writes,%{SPACE}%{NUMBER:osd.wall.syncs:float}(K|M|G)?%{SPACE}syncs,%{SPACE}%{NUMBER:osd.wall.writes_per_sync:float}%{SPACE}writes per sync, written\\:%{SPACE}%{NUMBER:osd.wall.written_gb:float}%{SPACE}(GB|MB|TB)?,%{SPACE}%{NUMBER:osd.written_mb_sec:float}", true),"upsert")
- merge_maps(attributes, ExtractGrokPatterns(body, "%{WORD:log_level}%{SPACE}%{NOTSPACE}%{SPACE}%{SPACE}%{NOTSPACE:process.name}", true),"upsert")
- set(attributes["config.parsed"], "ceph_osd") where attributes["osd.stats.level"] != nil
- set(attributes["config.parsed"], "ceph_osd") where attributes["osd.wall.type"] != nil
{{- end }}

{{- define "ceph.pipeline" }}
logs/ceph:
receivers: [filelog/containerd]
processors: [k8sattributes,attributes/cluster,transform/ingress,transform/ceph_rgw,transform/ceph_osd]
exporters: [forward]
{{- end }}
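
For orientation, the `ceph.labels` entries use the `tag_name`/`key`/`from` shape of the OpenTelemetry Collector's `k8sattributes` processor (pod-label extraction), and `ceph.pipeline` wires receivers and processors defined elsewhere in the chart into a dedicated logs pipeline. Below is a rough sketch of where the rendered output might land in the collector config; the including template is not part of this diff, so the surrounding keys are assumptions:

```yaml
# Rough sketch only — the template that includes these defines is not shown
# in this diff; placement is assumed from the usual OTel Collector layout.
processors:
  k8sattributes:
    extract:
      labels:
        # {{ include "ceph.labels" . }} would render the tag_name/key/from
        # entries defined above, e.g.:
        - tag_name: app.label.component
          key: app.kubernetes.io/component
          from: pod
service:
  pipelines:
    # {{ include "ceph.pipeline" . }} contributes:
    logs/ceph:
      receivers: [filelog/containerd]
      processors: [k8sattributes, attributes/cluster, transform/ingress, transform/ceph_rgw, transform/ceph_osd]
      exporters: [forward]
```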
64 changes: 64 additions & 0 deletions opentelemetry/chart/templates/_kvm-config.tpl
@@ -0,0 +1,64 @@
{{- define "kvm.labels" }}

{{- end }}
Comment on lines +1 to +3 (Contributor):

Suggested change:
{{- define "kvm.labels" }}
{{- end }}

Is this needed?
{{- define "kvm.receiver" }}
filelog/kvm_logs:
include: [ /var/log/libvirt/qemu/*.log, /var/log/openvswitch/*.log ]
include_file_path: true
start_at: beginning
multiline:
line_start_pattern: ^\d{4}-\d{2}-\d{2}
operators:
- type: regex_parser
regex: (?P<logtime>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}.\d{3})
- type: time_parser
parse_from: attributes.logtime
layout: '%Y-%m-%dT%H:%M:%S.%L'
layout_type: strptime
- id: file-label
type: add
field: attributes["log.type"]
value: "files"
{{- end }}
{{- define "kvm.transform" }}
transform/kvm_openvswitch:
error_mode: ignore
log_statements:
- context: log
conditions:
- resource.attributes["k8s.daemonset.name"] == "neutron-openvswitch-agent"
statements:
- merge_maps(attributes, ExtractGrokPatterns(body, "%{TIMESTAMP_ISO8601:logtime}%{SPACE}%{NUMBER:process.id}%{SPACE}%{WORD:log_level}%{SPACE}%{NOTSPACE:process.name}%{SPACE}\\[%{REQUEST_ID:request.id}%{SPACE}%{REQUEST_ID:request.global_id}", true, ["REQUEST_ID=([A-Za-z0-9-]+)"]), "upsert")
- set(attributes["config.parsed"], "kvm_openvswitch") where attributes["log_level"] != nil

transform/kvm_nova_agent:
error_mode: ignore
log_statements:
- context: log
conditions:
- resource.attributes["k8s.daemonset.name"] == "nova-hypervisor-agents-compute-kvm"
statements:
- merge_maps(attributes, ExtractGrokPatterns(body, "%{TIMESTAMP_ISO8601:logtime}%{SPACE}%{NUMBER:process.id}%{SPACE}%{WORD:log_level}%{SPACE}%{NOTSPACE:process.name}%{SPACE}\\[%{REQUEST_ID:request.id}%{SPACE}%{REQUEST_ID:request.global_id}", true, ["REQUEST_ID=([A-Za-z0-9-]+)"]), "upsert")
- set(attributes["config.parsed"], "kvm_nova_agent") where attributes["log_level"] != nil

transform/kvm_logs:
error_mode: ignore
log_statements:
- context: log
conditions:
- resource.attributes["log.type"] == "files"
statements:
- merge_maps(attributes, ExtractGrokPatterns(body, "%{TIMESTAMP_ISO8601:timestamp}%{SPACE}%{GREEDYDATA:log}",true), "upsert")
- set(attributes["config.parsed"], "files") where attributes["log_level"] != nil

{{- end }}
{{- define "kvm.pipeline" }}
logs/kvm_containerd:
receivers: [filelog/containerd]
processors: [k8sattributes,attributes/cluster,transform/ingress,transform/kvm_openvswitch,transform/kvm_nova_agent]
exporters: [forward]
logs/kvm_filelog:
receivers: [filelog/kvm_logs]
processors: [k8sattributes,attributes/cluster,transform/kvm_logs]
exporters: [forward]
{{- end }}
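
For illustration, here is what the `filelog/kvm_logs` receiver defined above would do with a single made-up libvirt QEMU log line; the sample line and values are assumptions for illustration only:

```yaml
# Hypothetical input line (not taken from this PR or from a real host):
#   2024-05-06T12:34:56.789+0000: initiating migration
# The multiline setting starts a new entry at the leading date, then the
# operators defined above would leave the entry with roughly:
attributes:
  logtime: "2024-05-06T12:34:56.789"   # named capture from the regex_parser
  log.type: "files"                    # added by the "file-label" add operator
# and the time_parser sets the entry timestamp from attributes.logtime.
```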