@codefromthecrypt
Contributor

What does this PR do?

This allows users of the llama-stack Docker image to use OpenTelemetry as in previous versions.

#4127 migrated to automatic instrumentation, but unless we add those libraries to the image, everyone needs to build a custom image to enable otel. Also, unless we establish a convention for enabling it, users who formerly just set config now need to override the entrypoint.

This PR bootstraps OTEL packages, so they are available (only +10MB). It also prefixes llama stack run with opentelemetry-instrument when any OTEL_* environment variable is set.

The result is implicit tracing like before, where you don't need a custom image to use traces or metrics.
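
The convention described above can be sketched roughly as follows. This is a minimal illustration and not the entrypoint script from this PR: the function names `has_otel_env` and `build_command` are hypothetical, and the sketch prints the resulting command line instead of exec'ing it so that the behavior is easy to inspect.

```bash
#!/bin/sh
# Hypothetical sketch of the entrypoint convention; not the PR's actual script.

# Succeeds when at least one exported environment variable name starts with OTEL_.
has_otel_env() {
  awk 'BEGIN { for (name in ENVIRON) if (index(name, "OTEL_") == 1) exit 0; exit 1 }'
}

# Prints the command line to launch. A real entrypoint would exec it instead.
build_command() {
  if has_otel_env; then
    # Any OTEL_* variable opts in to auto-instrumentation.
    echo "opentelemetry-instrument $*"
  else
    echo "$*"
  fi
}

build_command llama stack run
```

Checking variable names instead of grepping raw `env` output avoids false positives from multi-line values that happen to contain `OTEL_` at the start of a line.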

Test Plan

# Build image
docker build -f containers/Containerfile \
  --build-arg DISTRO_NAME=starter \
  --build-arg INSTALL_MODE=editable \
  --tag llamastack/distribution-starter:otel-test .

# Run with OTEL env to implicitly use `opentelemetry-instrument`. The
# settings below ensure inbound traces are honored, but no
# "junk traces" like SQL connects are created.
docker run -p 8321:8321 \
  -e OTEL_EXPORTER_OTLP_ENDPOINT=http://host.docker.internal:4318 \
  -e OTEL_SERVICE_NAME=llama-stack \
  -e OTEL_TRACES_SAMPLER=parentbased_traceidratio \
  -e OTEL_TRACES_SAMPLER_ARG=0.0 \
  llamastack/distribution-starter:otel-test

Ran a sample flight search agent, which is instrumented on the client side. Both the agent and llama-stack export to otel-tui. I verified that no root database spans are created, yet database spans are attached to incoming traces.

screenshot

Builds on llamastack#4127 by adding OpenTelemetry auto-instrumentation support to Docker images. After llamastack#4127 migrated to automatic instrumentation, the Docker images lacked the necessary dependencies. This PR installs the OTEL packages and enables instrumentation when any OTEL_* environment variable is set.

Test Plan:

Build image:
docker build -f containers/Containerfile   --build-arg DISTRO_NAME=starter   --build-arg INSTALL_MODE=editable   --tag llamastack/distribution-starter:otel-test .

Run with trace propagation enabled (parentbased_traceidratio with 0.0 prevents new traces but allows propagation of incoming traces):
docker run -p 8321:8321   -e OTEL_EXPORTER_OTLP_ENDPOINT=http://host.docker.internal:4318   -e OTEL_SERVICE_NAME=llama-stack   -e OTEL_TRACES_SAMPLER=parentbased_traceidratio   -e OTEL_TRACES_SAMPLER_ARG=0.0   llamastack/distribution-starter:otel-test
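
To illustrate why these sampler settings suppress new traces while still honoring incoming ones, here is a simplified model of the decision a parent-based ratio sampler makes. The function name `sampling_decision` and its string arguments are hypothetical, and only the ratio endpoints are modeled; for fractional ratios a real sampler decides from the trace ID.

```bash
#!/bin/sh
# Simplified, hypothetical model of parentbased_traceidratio; not SDK code.
# Args: parent state (none|sampled|unsampled) and the sampler ratio argument.
sampling_decision() {
  parent=$1
  ratio=$2
  case "$parent" in
    # A parent exists: follow the parent's sampled flag, ignoring the ratio.
    sampled)   echo "RECORD_AND_SAMPLE" ;;
    unsampled) echo "DROP" ;;
    none)
      # Root span: the ratio sampler decides.
      case "$ratio" in
        0|0.0) echo "DROP" ;;
        1|1.0) echo "RECORD_AND_SAMPLE" ;;
        *)     echo "DEPENDS_ON_TRACE_ID" ;;
      esac
      ;;
    *) return 1 ;;
  esac
}

sampling_decision none 0.0      # a locally started span, e.g. a SQL connect
sampling_decision sampled 0.0   # a request carrying a sampled trace context
```

With a ratio of 0.0, spans without a parent are dropped, while spans whose incoming parent is sampled are still recorded.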

Ran a sample flight search agent. Traces successfully captured.

Signed-off-by: Adrian Cole <[email protected]>
Contributor

@ashwinb ashwinb left a comment

thanks. makes sense!

@ashwinb ashwinb merged commit 4237eb4 into llamastack:main Dec 3, 2025
28 checks passed
@codefromthecrypt codefromthecrypt deleted the feat/otel-docker-instrumentation branch December 3, 2025 02:06
r-bit-rry pushed a commit to r-bit-rry/llama-stack that referenced this pull request Dec 3, 2025
…lamastack#4281)

# What does this PR do?

This allows users of the llama-stack Docker image to use OpenTelemetry
as in previous versions.

llamastack#4127 migrated to automatic instrumentation, but unless we add those
libraries to the image, everyone needs to build a custom image to enable
otel. Also, unless we establish a convention for enabling it, users who
formerly just set config now need to override the entrypoint.

This PR bootstraps OTEL packages, so they are available (only +10MB). It
also prefixes `llama stack run` with `opentelemetry-instrument` when any
`OTEL_*` environment variable is set.

The result is implicit tracing like before, where you don't need a
custom image to use traces or metrics.

## Test Plan

```bash
# Build image
docker build -f containers/Containerfile \
  --build-arg DISTRO_NAME=starter \
  --build-arg INSTALL_MODE=editable \
  --tag llamastack/distribution-starter:otel-test .

# Run with OTEL env to implicitly use `opentelemetry-instrument`. The
# settings below ensure inbound traces are honored, but no
# "junk traces" like SQL connects are created.
docker run -p 8321:8321 \
  -e OTEL_EXPORTER_OTLP_ENDPOINT=http://host.docker.internal:4318 \
  -e OTEL_SERVICE_NAME=llama-stack \
  -e OTEL_TRACES_SAMPLER=parentbased_traceidratio \
  -e OTEL_TRACES_SAMPLER_ARG=0.0 \
  llamastack/distribution-starter:otel-test
```

Ran a sample flight search agent, which is instrumented on the client
side. Both the agent and llama-stack export to
[otel-tui](https://github.com/ymtdzzz/otel-tui). I verified that no root
database spans are created, yet database spans are attached to incoming traces.


<img width="1608" height="742" alt="screenshot"
src="https://github.com/user-attachments/assets/69f59b74-3054-42cd-947d-a6c0d9472a7c"
/>

Signed-off-by: Adrian Cole <[email protected]>
r-bit-rry pushed a commit to r-bit-rry/llama-stack that referenced this pull request Dec 4, 2025
…lamastack#4281)
