Add otel! #36

Merged: 14 commits, Oct 19, 2023
9 changes: 6 additions & 3 deletions charts/operator-wandb/Chart.lock
@@ -19,6 +19,9 @@ dependencies:
   version: 0.1.0
 - name: redis
   repository: https://charts.bitnami.com/bitnami
-  version: 18.1.0
-digest: sha256:0e062062405e017968fb5ad0e5064936cb55e2b441ddb1c2048f34eaf6de11a8
-generated: "2023-09-27T12:33:43.680199603-04:00"
+  version: 18.1.5
+- name: otel
+  repository: file://charts/otel
+  version: 0.1.0
+digest: sha256:d6f7dbed1f8fcbbd34d18b0911891fb27eeef0021092b69ea35e7ca5dcede038
+generated: "2023-10-16T19:07:10.090393-04:00"
6 changes: 5 additions & 1 deletion charts/operator-wandb/Chart.yaml
@@ -2,7 +2,7 @@ apiVersion: v2
 name: operator-wandb
 description: A Helm chart for deploying W&B to Kubernetes
 type: application
-version: 0.10.24
+version: 0.10.25
 appVersion: 1.0.0
 icon: https://wandb.ai/logo.svg

@@ -40,3 +40,7 @@ dependencies:
     version: "18.*.*"
     condition: redis.install
     repository: https://charts.bitnami.com/bitnami
+  - name: otel
+    version: "*.*.*"
+    repository: file://charts/otel
+    condition: otel.install
8 changes: 8 additions & 0 deletions charts/operator-wandb/charts/app/templates/deployment.yaml
@@ -76,6 +76,9 @@ spec:
   - name: prometheus
     containerPort: 8181
     protocol: TCP
+  - name: gorilla-statsd
+    containerPort: 8125
+    protocol: TCP
 env:
   - name: HOST
     value: "{{ .Values.global.host }}"
@@ -153,6 +156,11 @@ spec:

   - name: GORILLA_SESSION_LENGTH
     value: "{{ .Values.global.auth.sessionLengthHours }}h"

+  - name: GORILLA_STATSD_PORT
+    value: "8125"
+  - name: GORILLA_STATSD_HOST
+    value: "0.0.0.0"
+
   - name: BUCKET
     value: "{{ include "app.bucket" . }}"
7 changes: 4 additions & 3 deletions charts/operator-wandb/charts/app/templates/service.yaml
@@ -1,4 +1,3 @@
-{{- if .Values.enabled }}
 apiVersion: v1
 kind: Service
 metadata:
@@ -23,6 +22,8 @@ spec:
     - port: 8181
       protocol: TCP
       name: prometheus
+    - port: 8125
+      protocol: UDP
+      name: gorilla-statsd
   selector:
-    {{- include "app.labels" . | nindent 4 }}
-{{- end }}
+    {{- include "app.labels" . | nindent 4 }}
8 changes: 4 additions & 4 deletions charts/operator-wandb/charts/console/values.yaml
@@ -38,8 +38,8 @@ resources:
   # specify resources, uncomment the following lines, adjust them as necessary,
   # and remove the curly braces after 'resources:'.
   requests:
-    cpu: 500m
-    memory: 1Gi
+    cpu: 200m
+    memory: 200Mi
   limits:
-    cpu: 4000m
-    memory: 8Gi
+    cpu: 1
+    memory: 500Mi
23 changes: 23 additions & 0 deletions charts/operator-wandb/charts/otel/.helmignore
@@ -0,0 +1,23 @@
# Patterns to ignore when building packages.
# This supports shell glob matching, relative path matching, and
# negation (prefixed with !). Only one pattern per line.
.DS_Store
# Common VCS dirs
.git/
.gitignore
.bzr/
.bzrignore
.hg/
.hgignore
.svn/
# Common backup files
*.swp
*.bak
*.tmp
*.orig
*~
# Various IDEs
.project
.idea/
*.tmproj
.vscode/
15 changes: 15 additions & 0 deletions charts/operator-wandb/charts/otel/Chart.yaml
@@ -0,0 +1,15 @@
apiVersion: v2
name: otel
type: application
description: A Helm chart for Kubernetes

version: 0.1.0
appVersion: "0.33.0"

home: https://wandb.ai
icon: https://wandb.ai/logo.svg

maintainers:
  - name: wandb
    email: [email protected]
    url: https://wandb.com
7 changes: 7 additions & 0 deletions charts/operator-wandb/charts/otel/README.md
@@ -0,0 +1,7 @@
We had to create a separate chart because the official one does not support our use case:

1. We need to send `otlphttp` data to the console server. The name of this service
   is dynamic, and the official OTel Helm chart does not support dynamic pipeline values.
2. We could do the above with a config map and pass it to the agent. However, the
   official OTel Helm chart does not support custom config map names, because they
   need to be based on the release name.
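
As a rough illustration of what this README describes, the values below sketch how a parent release might enable the subchart and point an `otlphttp` exporter at a release-dependent console service. It assumes the rendered config passes through `tpl` (as `otel.config` in `_config.tpl` does) and that user-supplied `config` values override the chart defaults. The `-console` service name suffix and port `8082` are hypothetical placeholders, not values taken from this PR.

```yaml
# Illustrative values override for the parent chart; not part of this PR.
otel:
  install: true  # matches `condition: otel.install` in the parent Chart.yaml
  config:
    exporters:
      otlphttp:
        # Hypothetical service name and port. The template expression is
        # only resolved because the final config is rendered with `tpl`.
        endpoint: "http://{{ .Release.Name }}-console:8082"
    service:
      pipelines:
        metrics:
          # Assumption: lists from values replace the default list rather than
          # appending to it, so the default exporters are repeated here.
          exporters: [debug, prometheus, otlphttp]
```

Because the endpoint is computed at render time, the same values file can in principle be reused across releases with different names, which is the dynamic behavior the README says the official chart lacks.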
93 changes: 93 additions & 0 deletions charts/operator-wandb/charts/otel/templates/_config.tpl
@@ -0,0 +1,93 @@
{{- define "otel.config" -}}
{{- $data := deepCopy .Values.config }}
{{- $config := .Values.config }}
{{- $config = mustMergeOverwrite (include "otel.hostMetricsReceiver" . | fromYaml) $config }}
{{- $config = mustMergeOverwrite (include "otel.logsCollectionReceiver" . | fromYaml) $config }}
{{- $config = mustMergeOverwrite (include "otel.kubeletMetricsReceiver" . | fromYaml) $config }}
{{- $config = mustMergeOverwrite (include "otel.kubernetesEventReceiver" . | fromYaml) $config }}
{{- $config = mustMergeOverwrite (include "otel.kubernetesClusterReceiver" . | fromYaml) $config }}
{{- $config = mustMergeOverwrite (include "otel.sqlQueryReceiver" . | fromYaml) $config }}
{{- $config = mustMergeOverwrite (include "otel.statsdAppReceiver" . | fromYaml) $config }}
{{- $config = mustMergeOverwrite (include "otel.extensions" . | fromYaml) $config }}
{{- $config = mustMergeOverwrite (include "otel.processors" . | fromYaml) $config }}
{{- $config = mustMergeOverwrite (include "otel.service" . | fromYaml) $config }}
{{- $config = mustMergeOverwrite (include "otel.exporter" . | fromYaml) $config }}
{{- tpl (toYaml $config) . }}
{{- end }}

{{- define "otel.exporter" -}}
exporters:
  debug: {}
  debug/detailed:
    verbosity: detailed
  prometheus:
    endpoint: 0.0.0.0:9109
{{- end }}

{{- define "otel.extensions" -}}
extensions:
  health_check: {}
  memory_ballast:
    size_in_percentage: 40
{{- end }}

{{- define "otel.processors" -}}
processors:
  batch: {}
  memory_limiter:
    check_interval: 5s
    limit_percentage: 80
    spike_limit_percentage: 25
  k8sattributes:
    filter:
      node_from_env_var: K8S_NODE_NAME
    passthrough: false
    pod_association:
      - sources:
          - from: resource_attribute
            name: k8s.pod.ip
      - sources:
          - from: resource_attribute
            name: k8s.pod.uid
      - sources:
          - from: connection
    extract:
      metadata:
        - "k8s.namespace.name"
        - "k8s.deployment.name"
        - "k8s.statefulset.name"
        - "k8s.daemonset.name"
        - "k8s.cronjob.name"
        - "k8s.job.name"
        - "k8s.node.name"
        - "k8s.pod.name"
        - "k8s.pod.uid"
        - "k8s.pod.start_time"
      annotations:
        - tag_name: $$1
          key_regex: (.*)
          from: pod
      labels:
        - tag_name: $$1
          key_regex: (.*)
          from: pod
{{- end }}

{{- define "otel.service" -}}
service:
  extensions:
    - health_check
    - memory_ballast
  pipelines:
    metrics:
      exporters: [debug, prometheus]
      processors: [memory_limiter, batch, k8sattributes]
      receivers: [hostmetrics, k8s_cluster, kubeletstats, sqlquery]
    logs:
      exporters: [debug]
      processors: [memory_limiter, batch]
      receivers: [filelog]
  telemetry:
    metrics:
      address: ${env:POD_IP}:8888
{{- end }}
102 changes: 102 additions & 0 deletions charts/operator-wandb/charts/otel/templates/_helpers.tpl
@@ -0,0 +1,102 @@
{{/* vim: set filetype=mustache: */}}

{{/*
Expand the name of the chart.
*/}}
{{- define "otel.name" -}}
{{- default .Chart.Name .Values.nameOverride | trunc 63 | trimSuffix "-" }}
{{- end }}

{{/*
Create a default fully qualified app name.
We truncate at 63 chars because some Kubernetes name fields are limited to this (by the DNS naming spec).
If release name contains chart name it will be used as a full name.
*/}}
{{- define "otel.fullname" -}}
{{- if .Values.fullnameOverride }}
{{- .Values.fullnameOverride | trunc 63 | trimSuffix "-" }}
{{- else }}
{{- $name := default .Chart.Name .Values.nameOverride }}
{{- if contains $name .Release.Name }}
{{- .Release.Name | trunc 63 | trimSuffix "-" }}
{{- else }}
{{- printf "%s-%s" .Release.Name $name | trunc 63 | trimSuffix "-" }}
{{- end }}
{{- end }}
{{- end }}

{{/*
Create chart name and version as used by the chart label.
*/}}
{{- define "otel.chart" -}}
{{- printf "%s-%s" .Chart.Name .Chart.Version | replace "+" "_" | trunc 63 | trimSuffix "-" }}
{{- end }}

{{/*
Common labels
*/}}
{{- define "otel.labels" -}}
helm.sh/chart: {{ include "otel.chart" . }}
{{ include "otel.selectorLabels" . }}
{{- if .Chart.AppVersion }}
app.kubernetes.io/version: {{ .Chart.AppVersion | quote }}
{{- end }}
wandb.com/app-name: {{ include "otel.chart" . }}
app.kubernetes.io/managed-by: {{ .Release.Service }}
{{- end }}

{{/*
Selector labels
*/}}
{{- define "otel.selectorLabels" -}}
app.kubernetes.io/name: {{ include "otel.name" . }}
app.kubernetes.io/instance: {{ .Release.Name }}
{{- end }}

{{/*
Create the name of the service account to use
*/}}
{{- define "otel.serviceAccountName" -}}
{{- if .Values.serviceAccount.create }}
{{- default (include "otel.fullname" .) .Values.serviceAccount.name }}
{{- else }}
{{- default "default" .Values.serviceAccount.name }}
{{- end }}
{{- end }}

{{/*
Returns the extraEnv keys and values to inject into containers.

Chart-specific values win over global ones: Sprig's `merge` favors its first argument.
*/}}
{{- define "otel.extraEnv" -}}
{{- $allExtraEnv := merge (default (dict) .local.extraEnv) .global.extraEnv -}}
{{- range $key, $value := $allExtraEnv }}
- name: {{ $key }}
  value: {{ $value | quote }}
{{- end -}}
{{- end -}}

{{/*
Returns a list of _common_ labels to be shared across all
app deployments and other shared objects.
*/}}
{{- define "otel.commonLabels" -}}
{{- $commonLabels := default (dict) .Values.common.labels -}}
{{- if $commonLabels }}
{{- range $key, $value := $commonLabels }}
{{ $key }}: {{ $value | quote }}
{{- end }}
{{- end -}}
{{- end -}}

{{/*
Returns a list of _pod_ labels to be shared across all
app deployments.
*/}}
{{- define "otel.podLabels" -}}
{{- range $key, $value := .Values.pod.labels }}
{{ $key }}: {{ $value | quote }}
{{- end }}
{{- end -}}
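
The `otel.extraEnv` helper above reads `.local.extraEnv` and `.global.extraEnv`, so it expects a dict with `local` and `global` keys rather than the root context. A minimal sketch of a call site follows, assuming both `extraEnv` and `global.extraEnv` are defined in values. The surrounding container fragment and the indentation width are hypothetical and not taken from this PR.

```yaml
# Hypothetical fragment of a container spec in one of the otel chart's templates.
env:
  - name: K8S_NODE_NAME   # env var name referenced by the k8sattributes filter
    valueFrom:
      fieldRef:
        fieldPath: spec.nodeName
  # Append one `- name: / value:` pair per merged extraEnv entry.
  {{- include "otel.extraEnv" (dict "global" .Values.global "local" .Values) | nindent 2 }}
```

Passing an explicit dict keeps the helper independent of where it is called from, at the cost of every call site having to build that dict.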
