-
Notifications
You must be signed in to change notification settings - Fork 83
Otel collector integration #519
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
|
Mateusz seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account. You have signed the CLA already but the status is still pending? Let us recheck it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This pull request adds OpenTelemetry Collector integration to the Netdata Helm chart, enabling the collection of logs from Kubernetes pods and forwarding them to a netdata instance via OTLP protocol.
Changes:
- Added
netdataOpentelemetrydeployment and service for receiving OTLP data - Integrated
opentelemetry-collectoras a subchart dependency for collecting Kubernetes logs - Fixed YAML indentation issues in ingress hosts and child tolerations configurations
Reviewed changes
Copilot reviewed 8 out of 10 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| charts/netdata/values.yaml | Added netdataOpentelemetry configuration section and otel-collector subchart settings; fixed indentation issues |
| charts/netdata/templates/netdata-otel/service.yaml | Created service to expose OTLP endpoint on port 4317 |
| charts/netdata/templates/netdata-otel/secrets.yaml | Created secrets template for OTEL configuration |
| charts/netdata/templates/netdata-otel/persistentvolumeclaim.yaml | Created PVC for OTEL log storage |
| charts/netdata/templates/netdata-otel/deployment.yaml | Created deployment for netdata instance with OTEL plugin enabled |
| charts/netdata/templates/netdata-otel/configmap.yaml | Created configmap template for OTEL configuration |
| charts/netdata/templates/_helpers.tpl | Added helper templates for netdataOpentelemetry configs management |
| charts/netdata/Chart.yaml | Added opentelemetry-collector subchart dependency |
| charts/netdata/Chart.lock | Generated lock file for subchart dependency |
| charts/netdata/charts/opentelemetry-collector-0.115.0.tgz | Added OpenTelemetry Collector subchart package |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
charts/netdata/values.yaml
Outdated
| hostname: "netdata-otel" | ||
| port: "{{ .Values.parent.port }}" | ||
|
|
||
| ## Service responsible for receiving OTEL logs |
Copilot
AI
Jan 24, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment describes the service as "Service responsible for receiving OTEL logs", but the netdata OTEL plugin configuration shows it can handle both metrics and logs (lines 652-664). Consider updating the comment to "Service responsible for receiving OTEL metrics and logs" for accuracy.
| ## Service responsible for receiving OTEL logs | |
| ## Service responsible for receiving OTEL metrics and logs |
|
@cubic-dev-ai review this PR |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 8 out of 10 changed files in this pull request and generated 5 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| {{- end }} | ||
| spec: | ||
| securityContext: | ||
| fsGroup: 201 |
Copilot
AI
Jan 24, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The securityContext is set to fsGroup 201, but unlike the parent deployment which also sets runAsUser and runAsGroup to 201, this deployment doesn't specify these fields. For consistency and security best practices, consider adding runAsUser and runAsGroup to match the pattern used in the parent deployment, unless there's a specific reason to omit them.
| fsGroup: 201 | |
| fsGroup: 201 | |
| runAsUser: 201 | |
| runAsGroup: 201 |
charts/netdata/values.yaml
Outdated
|
|
||
| podAnnotations: {} | ||
|
|
||
| dnsPolicy: ClusterFirstWithHostNet |
Copilot
AI
Jan 24, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The dnsPolicy is set to 'ClusterFirstWithHostNet', but this deployment doesn't set hostNetwork to true (unlike the child deployment which does). The 'ClusterFirstWithHostNet' policy is specifically designed for pods using hostNetwork. Since this deployment doesn't use hostNetwork, consider changing this to 'ClusterFirst' or 'Default' to match the parent deployment pattern which uses 'Default'.
| dnsPolicy: ClusterFirstWithHostNet | |
| dnsPolicy: Default |
witalisoft
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make sure that otel is also enabled when we are running checks
| - name: varlib | ||
| mountPath: /var/log/netdata/otel |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
varlib but mounted under /var/log, I would choose the /var/lib
| netdataOpentelemetry: | ||
| enabled: false | ||
| hostname: "netdata-otel" | ||
| port: "{{ .Values.parent.port }}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
parent port ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, same as with the k8s-state or child. This is not the service port.
c268d60 to
f065b9d
Compare
f065b9d to
ac92e27
Compare
5c3757b to
4b1dbb2
Compare
|
|
||
| The following table lists the configurable parameters of the netdata chart and their default values. | ||
|
|
||
| {{ template "chart.valuesTableHtml" . }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
HTML because it renders better the multiline comments and values + adds some extra feats like copy, colors. Any markdown parser will render it no problem anyway.
| - name: Run chart-testing (install) | ||
| run: ct install --target-branch ${{ github.event.repository.default_branch }} | ||
|
|
||
| - name: Run chart-testing (install with OpenTelemetry) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
an extra for easier problem catching, in theory we can just use this one and delete the regular install. Good thing is that at the end of the ct install there is a cleanup so they are both independent.
|
Last thing left for now is to make docs a bit more readable, after that it should be ready for review |
c4cc83f to
73258ee
Compare
73258ee to
138f84e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 12 out of 14 changed files in this pull request and generated 4 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| resources: | ||
| {{ toYaml .Values.sysctlInitContainer.resources | indent 12 }} | ||
| {{- end }} | ||
| containers: |
Copilot
AI
Jan 30, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The netdata-otel deployment is missing both pod-level and container-level securityContext configurations that are present in the parent deployment (parent/deployment.yaml lines 37-38, 121-123). This creates an inconsistency in security posture across the deployments. The parent deployment sets fsGroup, runAsUser, and runAsGroup, which should likely be applied to the netdata-otel deployment as well for consistency and proper file permissions on mounted volumes.
| # @section -- Child | ||
| failureThreshold: 3 | ||
| # -- How often (in seconds) to perform the liveness probe | ||
| # @section -- Child1.0 |
Copilot
AI
Jan 30, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The section annotation should be "Child" instead of "Child1.0". This creates an incorrect documentation section "Child1.0" in the generated README.md (line 1315) instead of being grouped with other Child configuration options.
| # @section -- Child1.0 | |
| # @section -- Child |
| ## Only to be used with type LoadBalancer | ||
| # loadBalancerIP: 10.0.1.69 | ||
| # loadBalancerSourceRanges: [] | ||
| # externalTrafficPolicy: Local | ||
| ## Only to be used with type LoadBalancer and external traffic policy Local | ||
| # healthCheckNodePort: | ||
| ## Only to be used with type ClusterIP | ||
| # clusterIP: 10.1.2.3 |
Copilot
AI
Jan 30, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The netdataOpentelemetry.service configuration is missing structured definitions for loadBalancerIP, loadBalancerSourceRanges, externalTrafficPolicy, healthCheckNodePort, and clusterIP fields. These are commented in lines 874-881 but not defined with proper defaults and documentation comments like the parent service configuration (lines 97-111). The service template (service.yaml) references these fields, but they won't be documented or have proper defaults if users need them.
| ## Only to be used with type LoadBalancer | |
| # loadBalancerIP: 10.0.1.69 | |
| # loadBalancerSourceRanges: [] | |
| # externalTrafficPolicy: Local | |
| ## Only to be used with type LoadBalancer and external traffic policy Local | |
| # healthCheckNodePort: | |
| ## Only to be used with type ClusterIP | |
| # clusterIP: 10.1.2.3 | |
| # -- LoadBalancer IP address (only used with service.type LoadBalancer) | |
| # @section -- Netdata OpenTelemetry | |
| loadBalancerIP: "" | |
| # -- Allowed source ranges for LoadBalancer (only used with service.type LoadBalancer) | |
| # @section -- Netdata OpenTelemetry | |
| loadBalancerSourceRanges: [] | |
| # -- External traffic policy (only used with service.type LoadBalancer) | |
| # @section -- Netdata OpenTelemetry | |
| externalTrafficPolicy: "" | |
| # -- Health check node port (only used with service.type LoadBalancer and external traffic policy Local) | |
| # @section -- Netdata OpenTelemetry | |
| healthCheckNodePort: "" | |
| # -- Cluster IP address (only used with service.type ClusterIP) | |
| # @section -- Netdata OpenTelemetry | |
| clusterIP: "" |
| - name: Run chart-testing (install) | ||
| run: ct install --target-branch ${{ github.event.repository.default_branch }} | ||
|
|
||
| - name: Run chart-testing (install with OpenTelemetry) |
Copilot
AI
Jan 30, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new test step "Run chart-testing (install with OpenTelemetry)" is missing the conditional check that the standard install step has (line 56: if: steps.list-changed.outputs.changed == 'true'). This means the OpenTelemetry test will run even when no charts have changed, which wastes CI resources and could lead to unexpected failures. Add the same condition to this step.
| - name: Run chart-testing (install with OpenTelemetry) | |
| - name: Run chart-testing (install with OpenTelemetry) | |
| if: steps.list-changed.outputs.changed == 'true' |
Adding ability to get otel metrics from the kubernetes