Skip to content

Conversation

@M4itee
Copy link
Contributor

@M4itee M4itee commented Jan 23, 2026

Adding ability to get otel metrics from the kubernetes

@CLAassistant
Copy link

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.


Mateusz seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request adds OpenTelemetry Collector integration to the Netdata Helm chart, enabling the collection of logs from Kubernetes pods and forwarding them to a netdata instance via OTLP protocol.

Changes:

  • Added netdataOpentelemetry deployment and service for receiving OTLP data
  • Integrated opentelemetry-collector as a subchart dependency for collecting Kubernetes logs
  • Fixed YAML indentation issues in ingress hosts and child tolerations configurations

Reviewed changes

Copilot reviewed 8 out of 10 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
charts/netdata/values.yaml Added netdataOpentelemetry configuration section and otel-collector subchart settings; fixed indentation issues
charts/netdata/templates/netdata-otel/service.yaml Created service to expose OTLP endpoint on port 4317
charts/netdata/templates/netdata-otel/secrets.yaml Created secrets template for OTEL configuration
charts/netdata/templates/netdata-otel/persistentvolumeclaim.yaml Created PVC for OTEL log storage
charts/netdata/templates/netdata-otel/deployment.yaml Created deployment for netdata instance with OTEL plugin enabled
charts/netdata/templates/netdata-otel/configmap.yaml Created configmap template for OTEL configuration
charts/netdata/templates/_helpers.tpl Added helper templates for netdataOpentelemetry configs management
charts/netdata/Chart.yaml Added opentelemetry-collector subchart dependency
charts/netdata/Chart.lock Generated lock file for subchart dependency
charts/netdata/charts/opentelemetry-collector-0.115.0.tgz Added OpenTelemetry Collector subchart package

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

hostname: "netdata-otel"
port: "{{ .Values.parent.port }}"

## Service responsible for receiving OTEL logs
Copy link

Copilot AI Jan 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment describes the service as "Service responsible for receiving OTEL logs", but the netdata OTEL plugin configuration shows it can handle both metrics and logs (lines 652-664). Consider updating the comment to "Service responsible for receiving OTEL metrics and logs" for accuracy.

Suggested change
## Service responsible for receiving OTEL logs
## Service responsible for receiving OTEL metrics and logs

Copilot uses AI. Check for mistakes.
@ilyam8
Copy link
Member

ilyam8 commented Jan 24, 2026

@cubic-dev-ai review this PR

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 8 out of 10 changed files in this pull request and generated 5 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

{{- end }}
spec:
securityContext:
fsGroup: 201
Copy link

Copilot AI Jan 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The securityContext is set to fsGroup 201, but unlike the parent deployment which also sets runAsUser and runAsGroup to 201, this deployment doesn't specify these fields. For consistency and security best practices, consider adding runAsUser and runAsGroup to match the pattern used in the parent deployment, unless there's a specific reason to omit them.

Suggested change
fsGroup: 201
fsGroup: 201
runAsUser: 201
runAsGroup: 201

Copilot uses AI. Check for mistakes.

podAnnotations: {}

dnsPolicy: ClusterFirstWithHostNet
Copy link

Copilot AI Jan 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The dnsPolicy is set to 'ClusterFirstWithHostNet', but this deployment doesn't set hostNetwork to true (unlike the child deployment which does). The 'ClusterFirstWithHostNet' policy is specifically designed for pods using hostNetwork. Since this deployment doesn't use hostNetwork, consider changing this to 'ClusterFirst' or 'Default' to match the parent deployment pattern which uses 'Default'.

Suggested change
dnsPolicy: ClusterFirstWithHostNet
dnsPolicy: Default

Copilot uses AI. Check for mistakes.
Copy link
Contributor

@witalisoft witalisoft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make sure that otel is also enabled when we are running checks

Comment on lines 135 to 136
- name: varlib
mountPath: /var/log/netdata/otel
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

varlib but mounted under /var/log, I would choose the /var/lib

netdataOpentelemetry:
enabled: false
hostname: "netdata-otel"
port: "{{ .Values.parent.port }}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

parent port ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, same as with the k8s-state or child. This is not the service port.

@M4itee M4itee force-pushed the otel-collector-integration branch from c268d60 to f065b9d Compare January 30, 2026 13:54
@M4itee M4itee force-pushed the otel-collector-integration branch from f065b9d to ac92e27 Compare January 30, 2026 13:59
@M4itee M4itee force-pushed the otel-collector-integration branch from 5c3757b to 4b1dbb2 Compare January 30, 2026 14:28

The following table lists the configurable parameters of the netdata chart and their default values.

{{ template "chart.valuesTableHtml" . }}
Copy link
Contributor Author

@M4itee M4itee Jan 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HTML because it renders better the multiline comments and values + adds some extra feats like copy, colors. Any markdown parser will render it no problem anyway.

- name: Run chart-testing (install)
run: ct install --target-branch ${{ github.event.repository.default_branch }}

- name: Run chart-testing (install with OpenTelemetry)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

an extra for easier problem catching, in theory we can just use this one and delete the regular install. Good thing is that at the end of the ct install there is a cleanup so they are both independent.

@M4itee
Copy link
Contributor Author

M4itee commented Jan 30, 2026

Last thing left for now is to make docs a bit more readable, after that it should be ready for review

@M4itee M4itee force-pushed the otel-collector-integration branch 4 times, most recently from c4cc83f to 73258ee Compare January 30, 2026 17:13
@M4itee M4itee force-pushed the otel-collector-integration branch from 73258ee to 138f84e Compare January 30, 2026 17:16
@ilyam8 ilyam8 requested a review from Copilot January 30, 2026 18:00
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 12 out of 14 changed files in this pull request and generated 4 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

resources:
{{ toYaml .Values.sysctlInitContainer.resources | indent 12 }}
{{- end }}
containers:
Copy link

Copilot AI Jan 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The netdata-otel deployment is missing both pod-level and container-level securityContext configurations that are present in the parent deployment (parent/deployment.yaml lines 37-38, 121-123). This creates an inconsistency in security posture across the deployments. The parent deployment sets fsGroup, runAsUser, and runAsGroup, which should likely be applied to the netdata-otel deployment as well for consistency and proper file permissions on mounted volumes.

Copilot uses AI. Check for mistakes.
# @section -- Child
failureThreshold: 3
# -- How often (in seconds) to perform the liveness probe
# @section -- Child1.0
Copy link

Copilot AI Jan 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The section annotation should be "Child" instead of "Child1.0". This creates an incorrect documentation section "Child1.0" in the generated README.md (line 1315) instead of being grouped with other Child configuration options.

Suggested change
# @section -- Child1.0
# @section -- Child

Copilot uses AI. Check for mistakes.
Comment on lines +874 to +881
## Only to be used with type LoadBalancer
# loadBalancerIP: 10.0.1.69
# loadBalancerSourceRanges: []
# externalTrafficPolicy: Local
## Only to be used with type LoadBalancer and external traffic policy Local
# healthCheckNodePort:
## Only to be used with type ClusterIP
# clusterIP: 10.1.2.3
Copy link

Copilot AI Jan 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The netdataOpentelemetry.service configuration is missing structured definitions for loadBalancerIP, loadBalancerSourceRanges, externalTrafficPolicy, healthCheckNodePort, and clusterIP fields. These are commented in lines 874-881 but not defined with proper defaults and documentation comments like the parent service configuration (lines 97-111). The service template (service.yaml) references these fields, but they won't be documented or have proper defaults if users need them.

Suggested change
## Only to be used with type LoadBalancer
# loadBalancerIP: 10.0.1.69
# loadBalancerSourceRanges: []
# externalTrafficPolicy: Local
## Only to be used with type LoadBalancer and external traffic policy Local
# healthCheckNodePort:
## Only to be used with type ClusterIP
# clusterIP: 10.1.2.3
# -- LoadBalancer IP address (only used with service.type LoadBalancer)
# @section -- Netdata OpenTelemetry
loadBalancerIP: ""
# -- Allowed source ranges for LoadBalancer (only used with service.type LoadBalancer)
# @section -- Netdata OpenTelemetry
loadBalancerSourceRanges: []
# -- External traffic policy (only used with service.type LoadBalancer)
# @section -- Netdata OpenTelemetry
externalTrafficPolicy: ""
# -- Health check node port (only used with service.type LoadBalancer and external traffic policy Local)
# @section -- Netdata OpenTelemetry
healthCheckNodePort: ""
# -- Cluster IP address (only used with service.type ClusterIP)
# @section -- Netdata OpenTelemetry
clusterIP: ""

Copilot uses AI. Check for mistakes.
- name: Run chart-testing (install)
run: ct install --target-branch ${{ github.event.repository.default_branch }}

- name: Run chart-testing (install with OpenTelemetry)
Copy link

Copilot AI Jan 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new test step "Run chart-testing (install with OpenTelemetry)" is missing the conditional check that the standard install step has (line 56: if: steps.list-changed.outputs.changed == 'true'). This means the OpenTelemetry test will run even when no charts have changed, which wastes CI resources and could lead to unexpected failures. Add the same condition to this step.

Suggested change
- name: Run chart-testing (install with OpenTelemetry)
- name: Run chart-testing (install with OpenTelemetry)
if: steps.list-changed.outputs.changed == 'true'

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants