New component: k8slog receiver #23339

h0cheung opened this issue Jun 13, 2023 · 24 comments
Labels
Accepted Component New component has been sponsored

Comments

@h0cheung
Contributor

h0cheung commented Jun 13, 2023

The purpose and use-cases of the new component

The existing filelog receiver can find and collect logs in files based on configured path expressions.

However, for k8s containers, the log files inside a container can only be reached through special paths on the host. It is not convenient to match those paths directly, and it is difficult to obtain the container's metadata or filter on it.

The sidecar approach brings additional problems such as deployment complexity and wasted resources. At present, there is no solution in OTel that meets our needs in all respects.

So we decided to develop a new component, the k8slog receiver, to provide full functionality for collecting logs from k8s containers. It is an all-in-one solution that can:

  • collect logs from files inside k8s containers via daemonset
  • support docker and cri-containerd
  • support many docker graph drivers / snapshotters
  • collect logs from stdout of k8s containers via daemonset
  • filter containers by metadata
  • extract metadata of containers
  • collect file logs in k8s containers via sidecar

If we want to collect logs from k8s, we can use this component.

Example configuration for the component

receivers:
  k8slog:
    discovery:
      mode: 1 # choices:
              # 1 (default): collect logs from files inside containers via daemonset
              # 2: collect logs from stdout of containers via daemonset
              # 3: collect logs from files via sidecar
      host_root: /host-root # default. We need to mount the host's / into the container as a subpath; read-only is enough.
      runtime_apis: # APIs to be used for discovering pods. By default it will try to connect to both docker and cri-containerd.
        type: docker
        addr: unix:///<host_root>/var/run/docker.sock # default. address of docker api.
        containerd_addr: unix:///<host_root>/run/docker/containerd/containerd.sock # default. address of underlying containerd api.
      filter: # filters for pods. By default all containers will be included.
        annotations:
        labels:
        namespaces:
        containers:
        pods:
        uids:
        expr: # custom expression (using antonmedv/expr)
      extract: # extract resource attributes, like the k8sattributes processor.
        metadata:
        env:
        annotations:
        labels:
    # configs below are (almost) the same as filelog receiver.
    include: # In `daemonset-file` mode, these paths are paths inside the containers. In `daemonset-stdout` mode, these paths are ignored.
      - /var/log/*.log
    exclude:
    start_at:
    multiline:
    force_flush_period:
    encoding:
    include_file_name:
    poll_interval:
    fingerprint_size:
    max_log_size:
    max_concurrent_files:
    attributes:
    resource:
    operators:
    storage:
    preserve_leading_whitespaces:
    preserve_trailing_whitespaces:
    max_batches:
    header:
    retry_on_failure:

Telemetry data types supported

logs

Is this a vendor-specific component?

  • This is a vendor-specific component
  • If this is a vendor-specific component, I am proposing to contribute this as a representative of the vendor.

Sponsor (optional)

@TylerHelmuth

Additional context

No response

@h0cheung h0cheung added the needs triage New item requiring triage label Jun 13, 2023
@h0cheung h0cheung changed the title New component: New component: k8slog receiver Jun 13, 2023
@songy23 songy23 added the Sponsor Needed New component seeking sponsor label Jun 14, 2023
@djaglowski djaglowski removed the needs triage New item requiring triage label Jun 28, 2023
@TylerHelmuth
Member

@h0cheung I will sponsor this component.

A couple of comments on the proposal:

  • Let's not introduce expr at the start. If we need custom expressions I would prefer to use OTTL, but we could introduce that later.
  • For mode, let's use strings instead of integers, as they make it easier for users reading the configuration to understand what the component is doing (see the sketch after this list).
  • Do we definitely need the include configuration option if we already have the filter configuration section? Could we combine them in some way?
  • I think it would be nice to nest the filelogreceiver configuration options under another section instead of placing them all at the root.
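
A minimal sketch of what those two suggestions could look like. The string mode values reuse the names used elsewhere in this proposal, and the file_log sub-section name is purely illustrative, not a settled design:

receivers:
  k8slog:
    discovery:
      mode: daemonset-file # hypothetical string values: daemonset-file, daemonset-stdout, sidecar
      host_root: /host-root
      filter:
        namespaces:
          - default
    file_log: # hypothetical sub-section grouping the filelog-style options instead of keeping them at the receiver root
      include:
        - /var/log/*.log
      start_at: beginning
      operators: []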

@TylerHelmuth TylerHelmuth added Accepted Component New component has been sponsored and removed Sponsor Needed New component seeking sponsor labels Aug 14, 2023
@djaglowski
Member

djaglowski commented Aug 15, 2023

I generally think this is a good idea but strongly believe we need to depend on the fileconsumer package for tracking and reading log files. It appears this is the intention, but I'd like to understand what if any differences you foresee.

configs below are (almost) the same as filelog receiver.

Can you clarify in which ways it is different? Is it only in the way the include paths are interpreted?

When reading from files, to what extent should users be able to apply operators? If there is some pre-configured set of operators that handles parsing, would a user be able to change these? Would they be able to place operators before and/or after the pre-defined operators?

@TylerHelmuth
Member

@djaglowski I agree that we should use fileconsumer.

I believe this component will pre-configure some operators based on kubernetes requirements. Since this is an opinionated receiver, I don't think those should be configurable (at least not the operators themselves - the proposal is using a different interface to configure how the resource attributes should be set and I like separating this config from the operator itself).

Users should be able to add operators after the component's pre-determined manipulation/setup.
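
For illustration only, a user-appended operator might then sit under the existing operators key and run after the receiver's own pre-determined k8s parsing; the regex_parser shown here is just an arbitrary example of a standard stanza operator:

receivers:
  k8slog:
    # discovery configuration omitted
    operators: # applied after the receiver's built-in container-log parsing
      - type: regex_parser
        regex: '^(?P<level>\w+)\s+(?P<message>.*)$'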

@h0cheung
Contributor Author

h0cheung commented Aug 20, 2023

@djaglowski @TylerHelmuth

As planned, I will write this component from scratch. Some of the configuration items are the same as in filelog; this is to make it easier for users to get started, not because the same implementation is used. There are several reasons for this:

  • fileconsumer currently has some imperfections, such as those that [pkg/stanza/fileconsumer] Add ability to read files asynchronously #23056 tries to address. It's not easy to make large changes to already-released components, but for new components I want to do as good a job as possible from the start.
  • For symbolic links, we need special handling based on the mount point mapping when looking for files.
  • In daemonset-stdout mode, the file path of the logs is fully determined, so no glob is needed. We can manage the reader directly after getting the file path; there is no need for resource-consuming mechanisms like fingerprinting, only log rotation has to be handled.

For efficiency and flexibility, parsing the k8s standard output and splicing containerd logs (based on "T" and "F" tags) will be implemented in separate code. There are no preconfigured operators.

@djaglowski
Member

As planned, I will write this component from scratch. Some of the configuration items are the same as in filelog; this is to make it easier for users to get started, not because the same implementation is used.

fileconsumer currently has some imperfections, such as those that #23056 tries to address. It's not easy to make large changes to already-released components, but for new components I want to do as good a job as possible from the start.

I strongly disagree with the idea of writing a completely separate implementation for file tracking and ingestion. We should address imperfections in pkg/stanza/fileconsumer rather than maintaining two separate implementations. For the particular issue mentioned here, this is a performance enhancement for an edge case - hardly a reason to reroll a major package and maintain two implementations.

For symbolic links, we need special handling based on the mount point mapping when looking for files.

I recommend we define an interface or function that can be passed to fileconsumer to customize behavior for handling symlinks. We can have a default behavior and another that is substituted by k8slog and perhaps reused by other components in the future.

In daemonset-stdout mode, the file path of the logs is fully determined, so no glob is needed. We can manage the reader directly after getting the file path; there is no need for resource-consuming mechanisms like fingerprinting, only log rotation has to be handled.

Similarly, I recommend we define an interface for finding files. We already have a matcher package which fully encapsulates the finding and filtering of files. It should be possible to substitute another implementation if necessary.

@TylerHelmuth
Member

I agree with @djaglowski.

@h0cheung
Contributor Author

h0cheung commented Aug 24, 2023

I think defining interfaces for the components in fileconsumer is a great idea.

I think we can try to implement asynchronous file reading, optimized fingerprint matching, and optimized log rotation determination in the k8slog receiver's implementation of the fileconsumer interfaces.

I'll make changes to the initial commit as soon as possible.

Let's work together to make this component happen.

@github-actions
Contributor

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

@github-actions github-actions bot added the Stale label Oct 24, 2023
@h0cheung
Contributor Author

h0cheung commented Nov 6, 2023

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

remove stale

@github-actions github-actions bot removed the Stale label Nov 7, 2023
Contributor

github-actions bot commented Jan 8, 2024

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

@github-actions github-actions bot added the Stale label Jan 8, 2024
@djaglowski djaglowski removed the Stale label Jan 8, 2024
@swiatekm
Contributor

Looking at the proposed configuration, three questions come to mind.

  1. Why interface directly with the container runtime, as opposed to the kubelet? If we want to filter over Pod attributes like labels, does the container runtime even know about them?

  2. Mounting the whole host root, even as read-only, looks problematic from a security standpoint. Do we actually need anything outside of /var/log/pods?

  3. How is mode 1 supposed to work in terms of implementation?

@h0cheung
Contributor Author

h0cheung commented Feb 20, 2024

Looking at the proposed configuration, three questions come to mind.

  1. Why interface directly with the container runtime, as opposed to the kubelet? If we want to filter over Pod attributes like labels, does the container runtime even know about them?
  2. Mounting the whole host root, even as read-only, looks problematic from a security standpoint. Do we actually need anything outside of /var/log/pods?
  3. How is mode 1 supposed to work in terms of implementation?

  1. In addition to the runtime API, we also use either the kube API or the kubelet.
  2. In daemonset-stdout mode there are often symbolic links under /var/log/pods. So we need to mount at least the targets of those symbolic links. In daemonset-file mode, we need to mount all the paths that are used as mount points by the container.
  3. Use the runtime API to get the mount point information (including the root filesystem), then map the pattern inside the container to the corresponding paths outside the container to find the log files.

For example, a docker container might have the following mount points:

  • / -> /var/lib/docker/overlay2/xxxxxxxxxxxxx/merged, where the newly generated files are under /var/lib/docker/overlay2/xxxxxxxxxxxxx/upper
  • /logs -> /home/xxx/logs/

Also, the host root is mounted under /host-root on the otel container.
Then, when the user configures include=["/logs/*.log", "/var/log/*.log"], we can find log files by /host-root/home/xxx/logs/*.log and /host-root/var/lib/docker/overlay2/xxxxxxxxxxxxx/upper/var/log/*.log.
If symbolic links are involved, we need to apply the above mount-point-based mapping recursively to the target of each link.
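
As a rough, configuration-level illustration of that mapping (assuming the hypothetical mounts above and the default host_root), the receiver would translate the container-internal include patterns into host-side patterns roughly like this:

receivers:
  k8slog:
    discovery:
      mode: daemonset-file
      host_root: /host-root # host / mounted read-only into the collector pod
    include:
      - /logs/*.log    # backed by the /home/xxx/logs bind mount
      - /var/log/*.log # backed by the overlayfs upper directory
# derived host-side patterns (not user configuration):
#   /host-root/home/xxx/logs/*.log
#   /host-root/var/lib/docker/overlay2/xxxxxxxxxxxxx/upper/var/log/*.log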

@swiatekm
Contributor

What you want to do looks quite complex. I think I'd prefer if we started with just mode 2, providing a nicer interface and more features over using the filelog receiver directly for that use case, and then possibly expand.

My general feeling is that philosophically it's a better approach to have a sidecar simply print logs from files inside the container to stdout, and then handle them via the standard Kubernetes logging mechanism, rather than go around that mechanism by digging into container runtime internals.

@dmitryax
Member

For example, for a docker container, it has the following mount points:

  • / -> /var/lib/docker/overlay2/xxxxxxxxxxxxx/merged, where the newly generated files are under /var/lib/docker/overlay2/xxxxxxxxxxxxx/upper
  • /logs -> /home/xxx/logs/

Why do we need docker runtime support? It has been gone since k8s 1.24.

@swiatekm
Contributor

For example, for a docker container, it has the following mount points:

  • / -> /var/lib/docker/overlay2/xxxxxxxxxxxxx/merged, where the newly generated files are under /var/lib/docker/overlay2/xxxxxxxxxxxxx/upper
  • /logs -> /home/xxx/logs/

Why do we need docker runtime support? It has been gone since k8s 1.24.

Some popular cloud providers offer extended support for K8s versions, one year after upstream EOL to be exact. They will still support 1.23 until October 2024.

Contributor

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

@github-actions github-actions bot added the Stale label May 27, 2024
@LH8PPL

LH8PPL commented Jun 18, 2024

Is someone actually working on this? Do we know when we will have a working receiver?

@github-actions github-actions bot removed the Stale label Jun 19, 2024
Contributor

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

@github-actions github-actions bot added the Stale label Aug 19, 2024
Contributor

This issue has been closed as inactive because it has been stale for 120 days with no activity.

@github-actions github-actions bot closed this as not planned (Won't fix, can't repro, duplicate, stale) Oct 18, 2024
atoulme pushed a commit that referenced this issue Mar 18, 2025
**Description:**
The k8slog receiver provides full functionality for collecting logs from k8s containers. It is an all-in-one solution that can:

- collect logs from files inside k8s containers via daemonset
- support docker and cri-containerd
- support many docker graph drivers / snapshotters
- collect logs from stdout of k8s containers via daemonset
- filter containers by metadata
- extract metadata of containers
- collect file logs in k8s containers via sidecar

If we want to collect logs from k8s, we can use this component.

**Link to tracking Issue:**

#23339

**Testing:**

Tests for unmarshaling configuration have been added.

**Documentation:**

Docs for the configuration and overall design have been added.

---------

Co-authored-by: Chris Mark <[email protected]>
@odubajDT
Contributor

Hi! I would like to reopen this issue (or open a new one), since this component could be very useful and it's already present in the repository (although not yet available in any of the collectors). Should the discussion about additional features happen inside this issue or in separate ones?

@h0cheung @dmitryax @TylerHelmuth @djaglowski

@odubajDT
Contributor

odubajDT commented Apr 15, 2025

I agree with @djaglowski about using the fileconsumer package to read logs from files.

I'm not 100% in favor of supporting the sidecar mode for now, since I assume this will require additional RBAC permissions granted to the whole Pod where the sidecar is deployed, instead of only to the sidecar container. For the first stage, the functionality can be achieved with a DaemonSet, where we can isolate the RBAC permissions to just this particular workload. Are there any strong arguments for implementing sidecar mode in the first stage?

Also, an additional note: what would be the difference between daemonset-stdout and daemonset-file? According to the k8s specification, logs of pods/containers are stored under /var/log/pods, so with the daemonset-stdout discovery mode we basically read from files anyway. Would an option be to unify these modes to make the configuration easier, with the default log path being /var/log/pods/* unless defined otherwise (e.g. when the logs are stored in a persistent volume)?
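
A minimal sketch of what such a unified daemonset mode could look like, assuming the default include path follows the kubelet's /var/log/pods layout (the mode name and the default are illustrative only, not part of the current proposal):

receivers:
  k8slog:
    discovery:
      mode: daemonset # hypothetical unified mode
      host_root: /host-root
    include:
      # default if omitted: the kubelet's pod log directory
      - /var/log/pods/*/*/*.log
      # add or override paths when logs are written to a persistent volume, e.g.:
      # - /data/app-logs/*.log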

Open question: There is alpha support for container log streams. I guess we may take this into account, at least for the future (I'm not a fan of supporting it while it's not stable). Would this be something interesting?

Fiery-Fenix pushed a commit to Fiery-Fenix/opentelemetry-collector-contrib that referenced this issue Apr 24, 2025
…telemetry#24439)

@azunna1

azunna1 commented May 8, 2025

Is anything else required for this component to be moved to alpha?

@h0cheung
Contributor Author

h0cheung commented May 9, 2025

Is anything else required for this component to be moved to alpha?

It hasn't been implemented yet.

@maxarndt
Contributor

I am looking for a way to collect Kubernetes Pod logs from individual namespaces using the OTEL Collector. I am in a quite restrictive environment where I cannot deploy Daemonsets.

Is there a plan to offer a mode that, for example, retrieves logs from the Kubernetes API? The API for this is available.

Open question: There is an alpha support of container log streams. I guess we may take this into account, at least for the future (not fan of supporting this unless it's not stable). Would this be something interesting?

The feature that @odubajDT mentioned refers to the splitting of stdout and stderr logs. The current API returns both log streams, which could already be a good starting point for this extension.

Most other Kubernetes receivers in this project offer the possibility of using them in only one or a few selected namespaces, which works with restricted rights on a cluster. I think it would be a great addition if, alongside metrics and K8s events, Pod logs could also be collected at the namespace level.

What do you think about this idea?
