istio k8s discovery #5854

Merged 27 commits on Mar 31, 2025
Commits
1cfa1b3 Spike: test istio k8s discovery (atoulme, Feb 1, 2025)
bd7e504 add more metrics (atoulme, Mar 19, 2025)
84d9177 Merge branch 'main' into istio (jinja2, Mar 20, 2025)
636d324 Spike: test istio k8s discovery (atoulme, Feb 1, 2025)
b7e5042 add more metrics (atoulme, Mar 19, 2025)
08086a3 update discover rules and config (jinja2, Mar 26, 2025)
771baee Merge branch 'istio' of github.com:signalfx/splunk-otel-collector int… (jinja2, Mar 26, 2025)
e95f3d7 Merge branch 'main' into istio (jinja2, Mar 26, 2025)
125f8ae add discovery test for istio (jinja2, Mar 27, 2025)
60f06ab Merge branch 'main' into istio (jinja2, Mar 27, 2025)
6ab6cde changelog (jinja2, Mar 27, 2025)
d6661ef Merge branch 'main' into istio (jinja2, Mar 27, 2025)
1cecbd0 fix test (jinja2, Mar 27, 2025)
029c9aa fix lint (jinja2, Mar 27, 2025)
08ef05e add doc (jinja2, Mar 27, 2025)
4b9565f Merge branch 'main' into istio (jinja2, Mar 27, 2025)
350050c fix test (jinja2, Mar 27, 2025)
99bdbea Merge branch 'main' into istio (jinja2, Mar 27, 2025)
fdd4157 multiline the config and check more attrs (jinja2, Mar 28, 2025)
77dc0d6 Merge branch 'main' into istio (jinja2, Mar 28, 2025)
7172616 add metrics to keep (jinja2, Mar 28, 2025)
5de7670 update metric for envoy to a more common one (jinja2, Mar 28, 2025)
96733df Merge branch 'main' into istio (jinja2, Mar 28, 2025)
5f9b0c8 Merge branch 'main' into istio (jinja2, Mar 31, 2025)
701b3ac disable prometheus/istio receiver (jinja2, Mar 31, 2025)
d75f696 Merge branch 'main' into istio (jinja2, Mar 31, 2025)
c19856e update changelog (jinja2, Mar 31, 2025)
20 changes: 20 additions & 0 deletions .github/workflows/configs/kind-config.yaml
@@ -0,0 +1,20 @@
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
  kubeadmConfigPatches:
  - |
    kind: InitConfiguration
    nodeRegistration:
      kubeletExtraArgs:
        node-labels: "ingress-ready=true"
  - |
    kind: KubeletConfiguration
    serverTLSBootstrap: true
  extraPortMappings:
  - containerPort: 80
    hostPort: 80
    protocol: TCP
  - containerPort: 443
    hostPort: 443
    protocol: TCP
15 changes: 11 additions & 4 deletions .github/workflows/integration-test.yml
@@ -399,7 +399,7 @@ jobs:
  id: get-matrix-k8s
  run: |
  includes=""
- for service in "envoy"; do
+ for service in "envoy" "istio"; do
  for arch in "amd64"; do
  includes="${includes},{\"SERVICE\": \"${service}\", \"ARCH\": \"${arch}\"}"
  done
@@ -425,9 +425,13 @@ jobs:
  node_image: kindest/node:v1.30.0
  kubectl_version: v1.30.0
  cluster_name: kind
+ config: ./.github/workflows/configs/kind-config.yaml
  - name: Deploy service under test
+ if: ${{ matrix.SERVICE != 'istio' }}
  run: |
- kubectl apply -f k8s/${{ matrix.SERVICE }}/*.yaml
+ for f in k8s/${{ matrix.SERVICE }}/*.sh; do
+ bash "$f"
+ done
  - uses: actions/setup-go@v5
  with:
  go-version: ${{ env.GO_VERSION }}
@@ -436,6 +440,9 @@ jobs:
  with:
  name: docker-otelcol-${{ matrix.ARCH }}
  path: ./docker-otelcol/${{ matrix.ARCH }}
+ - name: Fix kubelet TLS server certificates
+ run: |
+ kubectl get csr -o=jsonpath='{range.items[?(@.spec.signerName=="kubernetes.io/kubelet-serving")]}{.metadata.name}{" "}{end}' | xargs kubectl certificate approve
  - run: docker load -i ./docker-otelcol/${{ matrix.ARCH }}/image.tar
  - name: Load Docker image in kind
  run: |
@@ -445,5 +452,5 @@ jobs:
  - name: Print logs
  if: failure()
  run: |
- kubectl get pods
- kubectl logs $(kubectl get pod -l app=otelcol -o jsonpath="{.items[0].metadata.name}")
+ kubectl get pods -A
+ kubectl get pod -A -l app=otelcol -o jsonpath="{range .items[*]}{.metadata.namespace} {.metadata.name}{'\n'}{end}" | xargs -r -n2 sh -c 'kubectl logs -n $0 $1'
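For readers following the `get-matrix-k8s` step, here is a minimal Python sketch of what the nested shell loops appear to compute: one `{SERVICE, ARCH}` entry per combination, now including `istio`. The function name `build_includes` is hypothetical, and wrapping the entries under an `include` key is an assumption, since that part of the step lies outside the hunk.

```python
import json


def build_includes(services, archs):
    """Return one {SERVICE, ARCH} entry per pair, in nested-loop order."""
    return [{"SERVICE": s, "ARCH": a} for s in services for a in archs]


includes = build_includes(["envoy", "istio"], ["amd64"])

# Assumption: the workflow places these entries under an "include" key of the
# job matrix; that wrapping step is not visible in the diff above.
matrix = json.dumps({"include": includes})
print(matrix)
```

Adding a service to the loop therefore adds one matrix job per architecture, which is why the later steps branch on `matrix.SERVICE`.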
3 changes: 3 additions & 0 deletions .gitignore
@@ -59,3 +59,6 @@ tests/installation/testdata/systemd/splunk-otel-collector.conf

  # For convenience excluding sarif files generated by ./.github/workflows/scripts/govulncheck-run.sh
  /govulncheck/
+
+ # temp istio installation files
+ tests/receivers/istio/istio-*/
2 changes: 2 additions & 0 deletions CHANGELOG.md
@@ -27,6 +27,8 @@ releases where appropriate.

  ### 💡 Enhancements 💡

+ - (Splunk) Add a new discovery bundle for Istio metrics which includes proxy, gateway, and pilot/istiod ([#5854](https://github.com/signalfx/splunk-otel-collector/pull/5854))
+   - This discovery receiver, named prometheus/istio, is disabled by default. Users can enable it by setting the discovery config `splunk.discovery.receivers.prometheus/istio.enabled=true`.
  - (Splunk) Update `splunk-otel-javaagent` to `v2.14.0` ([#6000](https://github.com/signalfx/splunk-otel-collector/pull/6000))
  - (Splunk) Update `jmx-metric-gatherer` to `v1.45.0` ([#5995](https://github.com/signalfx/splunk-otel-collector/pull/5995))
  - (Splunk) Use direct connection for MongoDB discovery ([#6042](https://github.com/signalfx/splunk-otel-collector/pull/6042))
4 changes: 4 additions & 0 deletions Makefile
@@ -130,6 +130,10 @@ smartagent-integration-test:
  integration-test-envoy-discovery-k8s:
  @set -e; cd tests && $(GOTEST_SERIAL) $(BUILD_INFO_TESTS) --tags=discovery_integration_envoy_k8s -v -timeout 5m -count 1 ./...

+ .PHONY: integration-test-istio-discovery-k8s
+ integration-test-istio-discovery-k8s:
+ @set -e; cd tests && $(GOTEST_SERIAL) $(BUILD_INFO_TESTS) --tags=discovery_integration_istio_k8s -v -timeout 15m -count 1 ./...
+
  .PHONY: gotest-with-codecov
  gotest-with-cover:
  @$(MAKE) for-all-target TARGET="test-with-codecov"
@@ -0,0 +1,110 @@
#####################################################################################
# This file is generated by the Splunk Distribution of the OpenTelemetry Collector. #
# #
# It reflects the default configuration bundled in the Collector executable for use #
# in discovery mode (--discovery) and is provided for reference or customization. #
# Please note that any changes made to this file will need to be reconciled during #
# upgrades of the Collector. #
#####################################################################################
# prometheus/istio:
#   enabled: false
#   rule:
#     k8s_observer: type == "pod" and ("istio.io/rev" in annotations or labels["istio"] == "pilot" or name matches "istio.*")
#   config:
#     default:
#       config:
#         scrape_configs:
#           - job_name: 'istio'
#             metrics_path: '`"prometheus.io/path" in annotations ? annotations["prometheus.io/path"] : "/metrics"`'
#             scrape_interval: 10s
#             static_configs:
#               - targets: ['`endpoint`:`"prometheus.io/port" in annotations ? annotations["prometheus.io/port"] : 15090`']
#             metric_relabel_configs:
#               - source_labels: [__name__]
#                 action: keep
#                 regex: "(envoy_cluster_lb_healthy_panic|\
#                   envoy_cluster_manager_warming_clusters|\
#                   envoy_cluster_membership_healthy|\
#                   envoy_cluster_membership_total|\
#                   envoy_cluster_ssl_handshake|\
#                   envoy_cluster_ssl_session_reused|\
#                   envoy_cluster_ssl_versions_TLSv1_2|\
#                   envoy_cluster_ssl_versions_TLSv1_3|\
#                   envoy_cluster_upstream_cx_active|\
#                   envoy_cluster_upstream_cx_close_notify|\
#                   envoy_cluster_upstream_cx_connect_attempts_exceeded|\
#                   envoy_cluster_upstream_cx_connect_ms_sum|\
#                   envoy_cluster_upstream_cx_connect_timeout|\
#                   envoy_cluster_upstream_cx_destroy_local_with_active_rq|\
#                   envoy_cluster_upstream_cx_http1_total|\
#                   envoy_cluster_upstream_cx_http2_total|\
#                   envoy_cluster_upstream_cx_idle_timeout|\
#                   envoy_cluster_upstream_cx_max_requests|\
#                   envoy_cluster_upstream_cx_none_healthy|\
#                   envoy_cluster_upstream_cx_pool_overflow|\
#                   envoy_cluster_upstream_cx_protocol_error|\
#                   envoy_cluster_upstream_cx_total|\
#                   envoy_cluster_upstream_rq_4xx|\
#                   envoy_cluster_upstream_rq_5xx|\
#                   envoy_cluster_upstream_rq_active|\
#                   envoy_cluster_upstream_rq_cancelled|\
#                   envoy_cluster_upstream_rq_completed|\
#                   envoy_cluster_upstream_rq_pending_active|\
#                   envoy_cluster_upstream_rq_retry|\
#                   envoy_cluster_upstream_rq_retry_limit_exceeded|\
#                   envoy_cluster_upstream_rq_timeout|\
#                   envoy_cluster_upstream_rq_tx_reset|\
#                   envoy_cluster_upstream_rq_time|\
#                   envoy_cluster_upstream_rq_xx|\
#                   envoy_listener_downstream_cx_total|\
#                   envoy_listener_ssl_versions_TLSv1_2|\
#                   envoy_listener_ssl_versions_TLSv1_3|\
#                   envoy_server_live|\
#                   envoy_server_memory_allocated|\
#                   envoy_server_memory_heap_size|\
#                   envoy_server_total_connections|\
#                   envoy_server_uptime|\
#                   istio_mesh_connections_from_logs|\
#                   istio_monitor_pods_without_sidecars|\
#                   istio_request_bytes|\
#                   istio_request_duration_milliseconds|\
#                   istio_request_messages_total|\
#                   istio_requests_total|\
#                   istio_response_messages_total|\
#                   istio_tcp_connections_closed_total|\
#                   istio_tcp_connections_opened_total|\
#                   istio_tcp_received_bytes_total|\
#                   istio_tcp_response_bytes_total|\
#                   pilot_conflict_inbound_listener|\
#                   pilot_eds_no_instances|\
#                   pilot_k8s_cfg_events|\
#                   pilot_k8s_endpoints_pending_pod|\
#                   pilot_k8s_endpoints_with_no_pods|\
#                   pilot_no_ip|\
#                   pilot_proxy_convergence_time|\
#                   pilot_proxy_queue_time|\
#                   pilot_services|\
#                   pilot_xds_cds_reject|\
#                   pilot_xds_eds_reject|\
#                   pilot_xds_expired_nonce|\
#                   pilot_xds_lds_reject|\
#                   pilot_xds_push_context_errors|\
#                   pilot_xds_push_time|\
#                   pilot_xds_rds_reject|\
#                   pilot_xds_send_time|\
#                   pilot_xds_write_timeout)(?:_sum|_count|_bucket)?"
#   status:
#     metrics:
#       - status: successful
#         strict: envoy_server_uptime
#         message: istio prometheus receiver is working for istio-proxy!
#       - status: successful
#         strict: pilot_services
#         message: istio prometheus receiver is working for istiod!
#     statements:
#       - status: failed
#         regexp: "connection refused"
#         message: The container is not serving http connections.
#       - status: failed
#         regexp: "dial tcp: lookup"
#         message: Unable to resolve istio prometheus tcp endpoint
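As a reading aid for the `k8s_observer` rule and the two backticked expressions in the scrape config above, the following Python sketch approximates how a pod might be selected and how its scrape target and metrics path fall back to `15090` and `/metrics` when the `prometheus.io/*` annotations are absent. The pod dictionaries, the function names `matches_istio_rule` and `scrape_settings`, and the treatment of `matches` as an unanchored regex search are all assumptions made for illustration, not the collector's implementation.

```python
import re

DEFAULT_PATH = "/metrics"
DEFAULT_PORT = 15090


def matches_istio_rule(pod):
    """Approximate the discovery rule for a pod described as a plain dict."""
    annotations = pod.get("annotations", {})
    labels = pod.get("labels", {})
    return pod.get("type") == "pod" and (
        "istio.io/rev" in annotations
        or labels.get("istio") == "pilot"
        # Assumption: `matches` behaves like an unanchored regex search.
        or re.search(r"istio.*", pod.get("name", "")) is not None
    )


def scrape_settings(pod):
    """Return (target, metrics_path), preferring the prometheus.io annotations."""
    annotations = pod.get("annotations", {})
    path = annotations.get("prometheus.io/path", DEFAULT_PATH)
    port = annotations.get("prometheus.io/port", DEFAULT_PORT)
    return f"{pod['endpoint']}:{port}", path


# Hypothetical pods: an injected sidecar, an istiod pod, and an unrelated pod.
sidecar = {
    "type": "pod",
    "name": "productpage-v1",
    "endpoint": "10.0.0.5",
    "annotations": {
        "istio.io/rev": "default",
        "prometheus.io/port": "15020",
        "prometheus.io/path": "/stats/prometheus",
    },
}
istiod = {"type": "pod", "name": "istiod-abc", "endpoint": "10.0.0.9",
          "labels": {"istio": "pilot"}}
plain = {"type": "pod", "name": "nginx", "endpoint": "10.0.0.7"}

for pod in (sidecar, istiod, plain):
    if matches_istio_rule(pod):
        print(pod["name"], scrape_settings(pod))
```

In this sketch the unrelated pod is skipped, while the two Istio pods each yield a target and path, with the defaults applied only where no annotation is present.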
@@ -0,0 +1,106 @@
#####################################################################################
# Do not edit manually! #
# All changes must be made to associated .tmpl file before running 'make bundle.d'. #
#####################################################################################
prometheus/istio:
  enabled: false
  rule:
    k8s_observer: type == "pod" and ("istio.io/rev" in annotations or labels["istio"] == "pilot" or name matches "istio.*")
  config:
    default:
      config:
        scrape_configs:
          - job_name: 'istio'
            metrics_path: '`"prometheus.io/path" in annotations ? annotations["prometheus.io/path"] : "/metrics"`'
            scrape_interval: 10s
            static_configs:
              - targets: ['`endpoint`:`"prometheus.io/port" in annotations ? annotations["prometheus.io/port"] : 15090`']
            metric_relabel_configs:
              - source_labels: [__name__]
                action: keep
                regex: "(envoy_cluster_lb_healthy_panic|\
                  envoy_cluster_manager_warming_clusters|\
                  envoy_cluster_membership_healthy|\
                  envoy_cluster_membership_total|\
                  envoy_cluster_ssl_handshake|\
                  envoy_cluster_ssl_session_reused|\
                  envoy_cluster_ssl_versions_TLSv1_2|\
                  envoy_cluster_ssl_versions_TLSv1_3|\
                  envoy_cluster_upstream_cx_active|\
                  envoy_cluster_upstream_cx_close_notify|\
                  envoy_cluster_upstream_cx_connect_attempts_exceeded|\
                  envoy_cluster_upstream_cx_connect_ms_sum|\
                  envoy_cluster_upstream_cx_connect_timeout|\
                  envoy_cluster_upstream_cx_destroy_local_with_active_rq|\
                  envoy_cluster_upstream_cx_http1_total|\
                  envoy_cluster_upstream_cx_http2_total|\
                  envoy_cluster_upstream_cx_idle_timeout|\
                  envoy_cluster_upstream_cx_max_requests|\
                  envoy_cluster_upstream_cx_none_healthy|\
                  envoy_cluster_upstream_cx_pool_overflow|\
                  envoy_cluster_upstream_cx_protocol_error|\
                  envoy_cluster_upstream_cx_total|\
                  envoy_cluster_upstream_rq_4xx|\
                  envoy_cluster_upstream_rq_5xx|\
                  envoy_cluster_upstream_rq_active|\
                  envoy_cluster_upstream_rq_cancelled|\
                  envoy_cluster_upstream_rq_completed|\
                  envoy_cluster_upstream_rq_pending_active|\
                  envoy_cluster_upstream_rq_retry|\
                  envoy_cluster_upstream_rq_retry_limit_exceeded|\
                  envoy_cluster_upstream_rq_timeout|\
                  envoy_cluster_upstream_rq_tx_reset|\
                  envoy_cluster_upstream_rq_time|\
                  envoy_cluster_upstream_rq_xx|\
                  envoy_listener_downstream_cx_total|\
                  envoy_listener_ssl_versions_TLSv1_2|\
                  envoy_listener_ssl_versions_TLSv1_3|\
                  envoy_server_live|\
                  envoy_server_memory_allocated|\
                  envoy_server_memory_heap_size|\
                  envoy_server_total_connections|\
                  envoy_server_uptime|\
                  istio_mesh_connections_from_logs|\
                  istio_monitor_pods_without_sidecars|\
                  istio_request_bytes|\
                  istio_request_duration_milliseconds|\
                  istio_request_messages_total|\
                  istio_requests_total|\
                  istio_response_messages_total|\
                  istio_tcp_connections_closed_total|\
                  istio_tcp_connections_opened_total|\
                  istio_tcp_received_bytes_total|\
                  istio_tcp_response_bytes_total|\
                  pilot_conflict_inbound_listener|\
                  pilot_eds_no_instances|\
                  pilot_k8s_cfg_events|\
                  pilot_k8s_endpoints_pending_pod|\
                  pilot_k8s_endpoints_with_no_pods|\
                  pilot_no_ip|\
                  pilot_proxy_convergence_time|\
                  pilot_proxy_queue_time|\
                  pilot_services|\
                  pilot_xds_cds_reject|\
                  pilot_xds_eds_reject|\
                  pilot_xds_expired_nonce|\
                  pilot_xds_lds_reject|\
                  pilot_xds_push_context_errors|\
                  pilot_xds_push_time|\
                  pilot_xds_rds_reject|\
                  pilot_xds_send_time|\
                  pilot_xds_write_timeout)(?:_sum|_count|_bucket)?"
  status:
    metrics:
      - status: successful
        strict: envoy_server_uptime
        message: istio prometheus receiver is working for istio-proxy!
      - status: successful
        strict: pilot_services
        message: istio prometheus receiver is working for istiod!
    statements:
      - status: failed
        regexp: "connection refused"
        message: The container is not serving http connections.
      - status: failed
        regexp: "dial tcp: lookup"
        message: Unable to resolve istio prometheus tcp endpoint
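The `keep` rule in `metric_relabel_configs` above can be checked offline. The sketch below rebuilds the same pattern shape from a three-name subset of the list (the subset and the helper name `is_kept` are illustrative choices, not part of the bundle) and uses a full match, since Prometheus anchors relabel regexes at both ends. It shows how the optional `_sum`, `_count`, and `_bucket` suffix lets histogram and summary series through while near-miss names are dropped.

```python
import re

# Illustrative subset of the metric names listed in the bundle's keep regex.
KEPT_NAMES = [
    "envoy_server_uptime",
    "istio_request_duration_milliseconds",
    "pilot_services",
]

# Same shape as the bundle regex: (name1|name2|...)(?:_sum|_count|_bucket)?
KEEP_PATTERN = re.compile("(" + "|".join(KEPT_NAMES) + ")(?:_sum|_count|_bucket)?")


def is_kept(metric_name):
    """Return True when the whole metric name matches the keep pattern."""
    # fullmatch emulates the implicit ^...$ anchoring of Prometheus relabeling.
    return KEEP_PATTERN.fullmatch(metric_name) is not None


for name in (
    "envoy_server_uptime",
    "istio_request_duration_milliseconds_bucket",
    "envoy_server_uptime_seconds",
    "pilot_xds",
):
    print(name, is_kept(name))
```

A check like this is a quick way to confirm that a metric you expect to see is covered by the list before enabling the receiver.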