This repo showcases a telemetry data pipeline architecture that combines security validation, intelligent sampling, and cost-effective observability for high-volume mobile applications.
Mobile apps face unique observability challenges that traditional collectors don't address:
- Security Risk: Standard collectors accept telemetry from any source—leaving you vulnerable to emulator farms, device tampering, and fraudulent data pollution
- Volume & Cost: Millions of mobile devices generate massive telemetry volumes, making cloud ingestion costs spiral out of control
- Lost Context: Basic sampling can break trace-log correlation, making troubleshooting impossible
This architecture focuses on solving those three problems:
- 🔒 Trust Gateway: Validates device identity and authenticity before accepting any telemetry
- 📊 Intelligent Sampling: Configurable sampling reduces costs while maintaining correlated traces and logs
- ⚡ Scalability: Handles millions of spans per day with predictable costs and complete observability
- 🌊 Event Streaming: Real-time telemetry streaming to Azure Event Hubs for event-driven architectures and analytics
- 🌐 Vendor Neutrality: Built on the OpenTelemetry standard, avoiding vendor lock-in and making it easy to switch or add observability backends
> [!WARNING]
> This project is highly experimental and under active development.
This solution is designed for high-security, high-volume mobile applications, common in financial services, healthcare, or any scenario where device identity is as critical as user identity.
Core Principles:
- Every telemetry data point is cryptographically tied to a verified physical device
- Sampling decisions maintain trace-log correlation for effective debugging
- Cost control through intelligent routing: sampled data to expensive storage, full data to cost-effective destinations
- Zero-trust approach: validate first, ingest second
> [!NOTE]
> The complete implementation of device enrollment with hardware-anchored public keys and request signature generation for proof-of-possession on the mobile app side is out of scope for this repository. This collector demonstrates the trust gateway validation pattern using dummy header validations as a simplified example of how device attestations would be verified.
- Trust Gateway Processor: Custom OpenTelemetry processor that validates device attestations through HTTP headers before accepting telemetry data
- Header Validation: Enforces presence of required headers (`X-App-Token`, `X-API-Key`) representing device identity claims
- API Key Authentication: Validates API keys against a configured whitelist to simulate device enrollment verification
- Telemetry Rejection: Automatically drops telemetry from unverified sources, preventing data pollution
- Probabilistic Sampling: Intelligent sampling strategy (10% by default, configurable) for high-volume trace and log data
- Correlated Sampling: Traces and logs are sampled together using trace_id to maintain observability context
- Selective Metrics: Full metrics collection (no sampling) for accurate dashboards and alerting
- Multi-Pipeline Architecture: Separate pipelines for sampled data (cost-effective cloud storage) and full data (comprehensive analysis)
- Volume Management: Designed to handle millions of spans per day while controlling cloud ingestion costs
- Azure Event Hubs Exporter: Custom exporter streaming telemetry data to Azure Event Hubs
- Real-time Processing: Event Hubs enables real-time stream processing and event-driven architectures
- Parquet Format Support: Exporter supports Parquet format with Snappy compression for efficient data serialization
- Optional Data Lake Integration: Configure Event Hubs Capture for automatic archival to Azure Data Lake Storage Gen2
- Analytics Ready: Captured data accessible by Azure Synapse, Databricks, Spark for advanced analytics
- Containerized Deployment: Production-ready Docker and Docker Compose configurations
- Kubernetes Ready: Includes K8s deployment manifests for cloud-native environments
- Azure Application Insights: Pre-configured for Azure Monitor integration with sampling
- Extensible Design: Ready to add multiple exporters with different sampling strategies
- Sample Mobile App: Reference Node.js application generating 1000 activities with correlated traces and logs
```mermaid
graph LR
    subgraph Mobile[Mobile App]
        App[OTLP<br/>Exporters]
    end
    App -->|HTTP with<br/>custom headers| Receiver
    subgraph Collector[Custom OTel Collector]
        Receiver[OTLP<br/>Receiver]
        subgraph Processors[Processors]
            Memory[Memory<br/>Limiter]
            TrustGW[Trust<br/>Gateway]
            Sampler[Probabilistic<br/>Sampler]
            Batch1[Batch<br/>Sampled]
            Batch2[Batch<br/>Full]
        end
        subgraph Exporters[Exporters]
            Azure[Azure<br/>App Insights]
            EventHub[Azure<br/>Event Hubs]
        end
        Receiver --> Memory
        Memory --> TrustGW
        TrustGW --> Sampler
        Sampler -->|~10% of correlated<br/>traces and logs| Batch1
        Batch1 --> Azure
        TrustGW -->|100% of data| Batch2
        Batch2 -->|metrics| Azure
        Batch2 -.->|traces, metrics and logs| EventHub
    end
    style Azure fill:#0078d4,stroke:#333,stroke-width:2px,color:#fff
    style EventHub fill:#0078d4,stroke:#333,stroke-width:2px,color:#fff
```
> [!NOTE]
> Optional Data Lake Integration: Azure Event Hubs can be configured with Event Hubs Capture to automatically archive raw telemetry data to Azure Data Lake Storage Gen2 for long-term retention and analytics.
The trust gateway processor (`processor/trustgatewayprocessor`) acts as a security checkpoint in the telemetry pipeline:
Validation Steps:
- Header Presence Check: Verifies all required headers exist in resource attributes
- API Key Verification: Validates the API key against a configured whitelist (simulating device enrollment database lookup)
- Telemetry Rejection: Drops data from unverified sources with detailed logging for security audits
How It Works:
- Intercepts telemetry data at the processor stage (after receiver, before export)
- Extracts device identity claims from OTLP resource attributes
- In a production system, this would validate cryptographic signatures; here we use API keys for demonstration
- Failed validation prevents data from reaching exporters, reducing noise and potential security risks
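For illustration, the core check might look like the following sketch. The helper name, attribute keys, and parameters here are assumptions for the example; the actual processor reads its required headers and key whitelist from configuration.

```go
package trustgatewayprocessor

import (
	"errors"

	"go.opentelemetry.io/collector/pdata/pcommon"
)

// validateResource is a hypothetical helper illustrating the check:
// device identity claims arrive as OTLP resource attributes, and any
// resource whose API key is not in the configured whitelist is rejected.
func validateResource(res pcommon.Resource, requiredHeaders []string, validKeys map[string]bool) error {
	attrs := res.Attributes()

	// Header presence check: every configured identity header must exist.
	for _, header := range requiredHeaders {
		if _, ok := attrs.Get(header); !ok {
			return errors.New("missing required header attribute: " + header)
		}
	}

	// API key verification against the whitelist (simulating an
	// enrollment database lookup).
	key, ok := attrs.Get("X-API-Key")
	if !ok || !validKeys[key.Str()] {
		return errors.New("API key not in whitelist")
	}
	return nil
}
```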
The collector implements a sophisticated sampling strategy designed to handle large volumes of telemetry data while controlling costs:
Multiple Pipelines with Different Strategies:
- Sampled Pipeline (Traces + Logs) → Azure Application Insights
  - 10% probabilistic sampling (configurable)
  - Traces and logs are correlated via `trace_id`
  - When a trace is sampled, all its logs are kept together
  - Perfect for cost-effective cloud storage while maintaining visibility
- Full Pipeline (Metrics) → Azure Application Insights
  - 100% of metrics sent (no sampling)
  - Critical for accurate dashboards, alerts, and SLOs
  - Metrics typically have lower volume than traces/logs
- Full Data Pipeline → Azure Event Hubs
  - Sends 100% of all telemetry (traces, logs, metrics) to Azure Event Hubs
  - Real-time event streaming for downstream processing and analytics
  - Data exported in Parquet format with Snappy compression
  - Optional: Enable Event Hubs Capture to automatically archive to Azure Data Lake Storage Gen2
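A sketch of how these pipelines might be wired in the service section of `config.yaml`. The exporter keys are illustrative: `azuremonitor` matches the contrib Azure Monitor exporter name, while `azureeventhubs` stands in for this repo's custom exporter; check the repo's `config.yaml` for the exact names.

```yaml
service:
  pipelines:
    # ~10% of traces and logs, correlated by trace_id -> Application Insights
    traces/sampled:
      receivers: [otlp]
      processors: [memory_limiter, trustgateway, probabilistic_sampler, batch/sampled]
      exporters: [azuremonitor]
    # 100% of metrics (no sampling) -> Application Insights
    metrics:
      receivers: [otlp]
      processors: [memory_limiter, trustgateway, batch/full]
      exporters: [azuremonitor]
    # 100% of all signals -> Event Hubs (logs and metrics follow the same pattern)
    traces/full:
      receivers: [otlp]
      processors: [memory_limiter, trustgateway, batch/full]
      exporters: [azureeventhubs]
```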
- Correlation: Logs inherit `trace_id` from their parent spans, ensuring sampling keeps related data together
- Flexibility: Different sampling rates per pipeline - aggressive for traces/logs, none for metrics
- Multi-Destination: Send sampled data to Application Insights for real-time monitoring, full data to Event Hubs for stream processing
- Cost Control: Handle millions of events per day while keeping cloud ingestion costs predictable
- Observability: 10% sampling still provides statistical significance for most use cases
- Event Streaming: Event Hubs enables real-time analytics, event-driven architectures, and optional data lake archival
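As a rough worked example of the statistical-significance point (illustrative numbers, not a measurement from this repo): at 1,000,000 spans per day, 10% sampling retains 100,000 spans. Estimating an error rate near 1% from that sample has a standard error of sqrt(0.01 × 0.99 / 100,000) ≈ 0.0003, about ±0.03 percentage points, which is ample precision for most dashboards and alerts.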
- Go 1.24 or later - Required to build the collector from source
- Docker & Docker Compose (optional) - For containerized deployment
- Node.js 20+ (optional) - Only needed to run the sample mobile app
- Azure Application Insights for sampled telemetry data
- Azure Event Hubs for streaming full telemetry data
Build the OTel Collector:
```bash
cd src/otel-collector
go build -o otelcol-custom .
```
Configure environment variables:
```bash
# Required: Azure Application Insights connection string
export APPLICATIONINSIGHTS_CONNECTION_STRING="InstrumentationKey=YOUR-KEY;IngestionEndpoint=https://..."

# Required: Azure Event Hubs namespace URL (for managed identity/service principal auth)
export EVENTHUBS_NAMESPACE_URL="https://YOUR-NAMESPACE.servicebus.windows.net"

# Optional: For Event Hubs connection string authentication (alternative to managed identity)
# export EVENTHUBS_CONNECTION_STRING="Endpoint=sb://YOUR-NAMESPACE.servicebus.windows.net/;SharedAccessKeyName=...;SharedAccessKey=..."

# Optional: For Azure DefaultAzureCredential (service principal)
# export AZURE_TENANT_ID="your-tenant-id"
# export AZURE_CLIENT_ID="your-client-id"
# export AZURE_CLIENT_SECRET="your-client-secret"
```
Run the Collector:
```bash
./otelcol-custom --config config.yaml
```
The collector will start and listen on the following ports:
- 4317 (OTLP gRPC)
- 4318 (OTLP HTTP)
- 13133 (Health check)
Verify it's running:
```bash
curl http://localhost:13133
```
Use Docker for a containerized deployment with easier configuration management.
Configure environment variables:
```bash
cd src/otel-collector

# Copy the example environment file
cp .env.example .env

# Edit .env and add your configuration:
# APPLICATIONINSIGHTS_CONNECTION_STRING="InstrumentationKey=YOUR-KEY;IngestionEndpoint=https://..."
# EVENTHUBS_NAMESPACE_URL="https://YOUR-NAMESPACE.servicebus.windows.net"
#
# Optional: For connection string auth instead of managed identity
# EVENTHUBS_CONNECTION_STRING="Endpoint=sb://YOUR-NAMESPACE.servicebus.windows.net/;SharedAccessKeyName=...;SharedAccessKey=..."
#
# Optional: For service principal authentication
# AZURE_TENANT_ID="your-tenant-id"
# AZURE_CLIENT_ID="your-client-id"
# AZURE_CLIENT_SECRET="your-client-secret"
```
Build and Run with Docker Compose:
```bash
docker-compose up
```
Docker Compose will:
- Build the collector image
- Load environment variables from the `.env` file
- Start the collector with proper port mappings
- Automatically restart on failure
Verify it's running:
```bash
curl http://localhost:13133
```
The `src/mobile-app` directory contains a Node.js application demonstrating how to send telemetry with custom headers.
Install dependencies:
```bash
cd src/mobile-app
npm install
```
Run the app:
```bash
# With default settings
npm start

# With custom configuration
COLLECTOR_URL=http://localhost:4318 \
API_KEY=mobile-app-secret-key-123 \
APP_TOKEN=my-mobile-app-token \
npm start
```
- Initializes OpenTelemetry SDK: Configures OTLP exporters for traces, metrics, and logs
- Sets Device Identity Attributes: Attaches API key and app token as resource attributes (simulating device enrollment data)
- Generates Sample Telemetry: Creates traces and metrics to demonstrate the full pipeline
- Tests Validation: Demonstrates both valid (authorized) and invalid (rejected) scenarios
- Observability: Shows how properly authenticated telemetry flows through the trust gateway
Key Integration Patterns:
- Custom resource attributes carry device identity claims
- HTTP headers are mapped to OTLP resource attributes
- Error handling demonstrates graceful degradation when validation fails
```bash
# Start the collector
cd src/otel-collector
./otelcol-custom --config config.yaml

# In another terminal, run the mobile app
cd src/mobile-app
npm start
```
You should see telemetry data being processed and logged by the collector.
Modify the mobile app to use an invalid API key:
```bash
API_KEY=invalid-key npm start
```
The collector will reject the telemetry data, and you'll see validation warnings in the collector logs.
| Parameter | Description | Default |
|---|---|---|
| `required_headers` | List of headers that must be present | `["X-App-Token"]` |
| `valid_api_keys` | Whitelist of valid API keys | `[]` |
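For example, in `config.yaml` (values are illustrative; the header list and keys should match your deployment):

```yaml
processors:
  trustgateway:
    # Headers that every device must present as resource attributes
    required_headers: ["X-App-Token", "X-API-Key"]
    # Whitelist simulating the device enrollment database
    valid_api_keys:
      - "mobile-app-secret-key-123"
```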
| Environment Variable | Description | Default |
|---|---|---|
| `COLLECTOR_URL` | OTel collector endpoint | `http://localhost:4318` |
| `API_KEY` | API key for authentication | `mobile-app-secret-key-123` |
| `APP_TOKEN` | Application token | `my-mobile-app-token` |
| Port | Protocol | Description |
|---|---|---|
| 4317 | gRPC | OTLP gRPC receiver |
| 4318 | HTTP | OTLP HTTP receiver |
| 13133 | HTTP | Health check endpoint |
- Create Processor Directory: `mkdir -p processor/myprocessor`
- Implement Required Files (a `config.go` sketch follows this list):
  - `config.go`: Define the configuration struct
  - `factory.go`: Implement the processor factory interface
  - `processor.go`: Core processing logic
- Register in Collector: Import and add to the builder in `main.go`
- Configure Pipeline: Add the processor to the `config.yaml` service pipelines
- Test: Write unit tests and integration tests
Example:
```go
// In main.go
import "custom-otel-collector/processor/myprocessor"

// Add to WithProcessors
.WithProcessors(
    myprocessor.NewFactory(),
    // ... other processors
)
```
To send data to external systems, add exporters to `main.go` and `config.yaml`:
```yaml
exporters:
  otlp:
    endpoint: "external-collector:4317"
    tls:
      insecure: false

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, trustgateway, batch]
      exporters: [debug, otlp] # Add your exporter here
```
Modify `processor/trustgatewayprocessor/processor.go` to add custom validation:
```go
func (p *trustGatewayProcessor) validateTelemetry(resources interface{}) error {
	// Add your custom validation logic here
	// For example: check IP allowlists, rate limiting, etc.
	return nil
}
```
- Check that the collector is running: `curl http://localhost:13133`
- Verify the mobile app is pointing to the correct URL
- Check for firewall rules blocking ports 4317/4318
- Verify the API key in the mobile app matches one in `valid_api_keys`
- Check collector logs for validation warnings
- Ensure custom headers are being sent (check network requests)
- Ensure Go modules are properly initialized
- Run `go mod tidy` before building
- Check Docker daemon is running