@@ -554,23 +554,127 @@ This method allows you to use a certificate that is trusted by external systems,
 
 For more advanced use cases, refer to the [official Helm chart documentation](https://github.com/open-telemetry/opentelemetry-helm-charts/blob/main/charts/opentelemetry-operator/values.yaml) for detailed configuration options and scenarios.
 
-### Troubleshooting the Operator and Cert Manager
-
-#### Check the logs for failures
-
-**Operator Logs:**
-
-```bash
-kubectl logs -l app.kubernetes.io/name=operator
-```
-
-**Cert-Manager Logs:**
-
-```bash
-kubectl logs -l app=certmanager
-kubectl logs -l app=cainjector
-kubectl logs -l app=webhook
-```
+### Troubleshooting the Operator
+
+#### General Debugging Steps
+In the following steps, the "operator namespace" refers to the namespace where the operator is deployed,
+which is the same namespace as the chart. The "API server namespace" usually defaults to `kube-system`,
+but this may vary depending on your Kubernetes distribution. If a namespace parameter is not explicitly
+provided, assume it refers to the operator or chart's namespace.
+
+- Check the logs for the operator to identify any issues:
+  ```bash
+  kubectl logs -l app.kubernetes.io/name=operator
+  ```
+- The operator webhooks must communicate with the Kubernetes API server. Errors related to webhook usage can often be found in the API server logs:
+- The `Service` and `Endpoints` for the Operator webhook.
+- Network policies in the Operator's namespace.
+
+### Known Issues
+
+**Custom Network Policies or Security Layers**
+- **Cause:** Tools like Calico, Cilium, or custom firewalls may block communication between the API
+  server and the operator webhook.
+- **Resolution:**
+  - Before reaching out to Splunk Support, consult with the infrastructure or platform
+    team that set up your cluster. They may have implemented custom network policies or security layers
+    that could be affecting communication.
+  - If you are using networking or security solutions from a third-party Kubernetes solution provider,
+    be aware that these may include configurations or custom CRDs that can impact this operator's
+    functionality. Since these configurations vary widely per provider, we cannot provide specific
+    guidance for every product here. We recommend reviewing the provider's configurations, CRD definitions,
+    and deployed CRD instances in your cluster to identify any settings related to networking or
+    security that might interfere with communication between the operator and the Kubernetes API server.
+    ```bash
+    kubectl get crds
+    kubectl get <crd-name> --all-namespaces
+    kubectl get <crd-name> -n <namespace> -o yaml
+    ```
+
+**[EKS/Cilium] API Server Error: "No endpoints available for service 'splunk-otel-collector-operator-webhook'"**
+- **Cause:** This is a general known issue in setups where the Kubernetes control plane cannot communicate
+  with admission webhooks, such as the operator's webhook, in other namespaces. This occurs because
+  the customer has deployed a custom networking solution (e.g., Cilium in overlay mode) that restricts
+  the expected communication between the control plane and webhooks that are not a part of the control
+  plane. The issue is not caused by the operator itself but by the limitations of the custom networking configuration.
+- **Resolution:**
+  - **Solution 1: Enable ENI Mode in Cilium**
+    - Update the AWS Cilium setup to use ENI mode. This configuration allows the control plane to communicate
+      with webhooks in other namespaces. Refer to the [Cilium ENI Documentation](https://docs.cilium.io/en/stable/gettingstarted/eni/).
+  - **Solution 2: Run the Operator in Host Network Mode**
+    - Modify the `splunk-otel-collector-chart` Helm chart values to enable host network mode for the operator:
+      ```yaml
+      operator:
+        hostNetwork: true
+      ```
+    - Apply the updated Helm chart configuration and redeploy the operator.
+    - **Note:** While this workaround resolves the issue, running the operator in host network mode is
+      considered a less secure practice, so Solution 1 is preferable from a security standpoint.
+
+- **Related Links:**
+  - [Cilium Issue #21959 How to use an admission webhook with Cilium?](https://github.com/cilium/cilium/issues/21959)
+  - [OpenTelemetry Operator Issue #2260 Webhook "address is not allowed" when creating an Instrumentation on EKS](https://github.com/open-telemetry/opentelemetry-operator/issues/2260)
+  - [Cilium Issue #30111 EKS Cilium in Overlay with ALB and webhooks: Address is not allowed](https://github.com/cilium/cilium/issues/30111)
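
The checks described in this change (the webhook `Service` and `Endpoints`, network policies in the operator's namespace, and whether the operator runs in host network mode) could be bundled roughly as follows. This is a minimal sketch, not part of the chart: the function names are hypothetical, the default Service name is assumed from the error message quoted above and will differ with the Helm release name, and only standard `kubectl` subcommands are used.

```bash
#!/usr/bin/env bash
# Sketch only: function names are hypothetical; the kubectl subcommands are standard.
# KUBECTL can be overridden, for example with a wrapper script or a dry-run stub.
KUBECTL="${KUBECTL:-kubectl}"

# Show the webhook Service and its Endpoints in the operator namespace.
# An Endpoints object without addresses may explain a "No endpoints available" error.
check_webhook_endpoints() {
  local ns="$1"
  # Assumption: default name taken from the error message; it depends on the release name.
  local svc="${2:-splunk-otel-collector-operator-webhook}"
  "$KUBECTL" get service "$svc" -n "$ns" -o wide
  "$KUBECTL" get endpoints "$svc" -n "$ns" -o wide
}

# List network policies in the operator namespace that could block the API server.
list_network_policies() {
  local ns="$1"
  "$KUBECTL" get networkpolicy -n "$ns"
}

# Print each operator pod with its hostNetwork setting (empty when not set).
show_operator_host_network() {
  local ns="$1"
  "$KUBECTL" get pods -n "$ns" -l app.kubernetes.io/name=operator \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.hostNetwork}{"\n"}{end}'
}
```

After sourcing the file, a call such as `check_webhook_endpoints <operator-namespace>` prints the two resources side by side. Setting `KUBECTL=echo` beforehand prints the commands instead of running them, which can be handy for reviewing what would be executed.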