@orestisfl orestisfl commented Oct 27, 2025

Lazy Initialization of the Cache Processor's File Store

The Problem

The core problem is that processors often use paths.Resolve to find directories like "data" or "logs". This function reads a global variable for the base path, which is fine when a Beat runs as a standalone process.

But when a Beat is embedded as a receiver (e.g., fbreceiver in the OTel Collector), this global causes problems. Each receiver needs its own isolated state directory, and a single global path prevents this.

The cache processor currently tries to set up its file-based store in its New function, which is too early. It only has access to the global path, not the receiver-specific path that gets configured later.

The Solution

My solution is to initialize the cache's file store lazily.

Instead of creating the store in cache.New, I've added a SetPaths(*paths.Path) method to the processor. This method creates the file store and is wrapped in a sync.Once to make sure it only runs once. The processor's internal store object stays nil until SetPaths is called during pipeline construction.

How it Works

The path info gets passed down when a client connects to the pipeline. Here's the flow:

  1. x-pack/filebeat/fbreceiver: createReceiver instantiates the processors (including cache with a nil store) and calls instance.NewBeatForReceiver.
  2. x-pack/libbeat/cmd/instance: NewBeatForReceiver creates the paths.Path object from the receiver's specific configuration.
  3. libbeat/publisher/pipeline: This paths.Path object is passed into the pipeline. When a client connects, the path is added to the beat.ProcessingConfig.
  4. libbeat/publisher/processing: The processing builder gets this config and calls group.SetPaths, which passes the path down to each processor.
  5. libbeat/processors/cache: SetPaths is finally called on the cache processor instance, and the sync.Once guard ensures the file store is created with the correct path.

Diagram

graph TD
    subgraph "libbeat/processors/cache (init)"
        A["init()"]
    end
    subgraph "libbeat/processors"
        B["processors.RegisterPlugin"]
        C{"registry"}
    end
    A --> B;
    B -- "Save factory" --> C;

    subgraph "x-pack/filebeat/fbreceiver"
        D["createReceiver"]
    end

    subgraph "libbeat/processors"
         E["processors.New(config)"]
         C -. "Lookup 'cache'" .-> E;
    end
    D --> E;
    D --> I;
    E --> G;

    subgraph "libbeat/processors/cache"
        G["cache.New()"] -- store=nil --> H{"cache"};
    end

    subgraph "x-pack/libbeat/cmd/instance"
        I["instance.NewBeatForReceiver"];
        I --> J{"paths.Path object"};
    end

    subgraph "libbeat/publisher/pipeline"
        J --> K["pipeline.New"];
        K --> L["ConnectWith"];
    end

    subgraph "libbeat/publisher/processing"
        L -- "Config w/ paths" --> N["builder.Create"];
        N --> O["group.SetPaths"];
    end

    subgraph "libbeat/processors/cache"
        O --> P["cache.SetPaths"];
        P --> Q["sync.Once"];
        Q -- "initialize store" --> H;
    end

Pros and Cons of This Approach

  • Pros:
    • It's a minimal, targeted change that solves the immediate problem.
    • It avoids a large-scale, breaking refactoring of all processors.
    • It maintains backward compatibility for existing processors and downstream consumers of libbeat.
  • Cons:
    • Using a type assertion for the setPaths interface feels a bit like magic, since the behavior changes at runtime depending on whether a processor implements it.

Alternatives Considered

Option 1: Add a paths argument to all processor constructors

  • Pros:
    • Simple and direct.
  • Cons:
    • Requires a global refactoring of all processors.
    • Breaks external downstream libbeat importers like Cloudbeat.
    • The paths argument is not needed in many processors, so adding a rarely used option to the function signature is verbose.

Option 2: Refactor processors to introduce a "V2" interface

  • Pros:
    • Allows for a new, backwards-compatible signature (e.g., using a config struct).
    • This can still be done later.
    • We can support both V1 processors and gradually move processors to V2.
  • Cons:
    • Needs a significant refactoring effort.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works. Where relevant, I have used the stresstest.sh script to run them under stress conditions and race detector to verify their stability.
  • I have added an entry in ./changelog/fragments using the changelog tool.

How to test this PR locally

Configuration

filebeat-cache-mwe.yml:

path.data: /tmp/data

filebeat.inputs:
  - type: filestream
    id: filestream-input
    enabled: true
    paths:
      - /tmp/logs/*.log
    parsers:
      - ndjson:
          target: ""

processors:
  # PUT: Store metadata when event.type is "source"
  - if:
      equals:
        event.type: "source"
    then:
      - cache:
          backend:
            file:
              id: test_cache
              write_interval: 5s
          put:
            key_field: event.id
            value_field: event.metadata
            ttl: 1h

  # GET: Retrieve metadata when event.type is "target"
  - if:
      equals:
        event.type: "target"
    then:
      - cache:
          backend:
            file:
              id: test_cache
          get:
            key_field: event.id
            target_field: cached_metadata

output.console:
  enabled: true

Setup

# Create directory
#rm -rf /tmp/data /tmp/logs
mkdir -p /tmp/logs

# Create test data
cat > /tmp/logs/test.log <<'EOF'
{"event":{"type":"source","id":"001","metadata":{"user":"user-1","role":"admin","sequence":1,"data":{"ip":"192.168.1.1","session":"session-001"}}},"message":"source event 1"}
{"event":{"type":"source","id":"002","metadata":{"user":"user-2","role":"admin","sequence":2,"data":{"ip":"192.168.1.2","session":"session-002"}}},"message":"source event 2"}
{"event":{"type":"source","id":"003","metadata":{"user":"user-3","role":"admin","sequence":3,"data":{"ip":"192.168.1.3","session":"session-003"}}},"message":"source event 3"}
{"event":{"type":"source","id":"004","metadata":{"user":"user-4","role":"admin","sequence":4,"data":{"ip":"192.168.1.4","session":"session-004"}}},"message":"source event 4"}
{"event":{"type":"source","id":"005","metadata":{"user":"user-5","role":"admin","sequence":5,"data":{"ip":"192.168.1.5","session":"session-005"}}},"message":"source event 5"}
{"event":{"type":"target","id":"001"},"message":"target event 1"}
{"event":{"type":"target","id":"002"},"message":"target event 2"}
{"event":{"type":"target","id":"003"},"message":"target event 3"}
{"event":{"type":"target","id":"004"},"message":"target event 4"}
{"event":{"type":"target","id":"005"},"message":"target event 5"}
EOF

# Run filebeat
./x-pack/filebeat/filebeat -e -c filebeat-cache-mwe.yml

Expected Output

Target events should have cached_metadata field populated:

{
  "event": {
    "type": "target",
    "id": "001"
  },
  "message": "target event 1",
  "cached_metadata": {
    "user": "user-1",
    "role": "admin",
    "sequence": 1,
    "data": {
      "ip": "192.168.1.1",
      "session": "session-001"
    }
  }
}

Cache Files

After running filebeat, check cache files:

cat /tmp/data/cache_processor/test_cache

example:

{"key":"001","val":{"data":{"ip":"192.168.1.1","session":"session-001"},"role":"admin","sequence":1,"user":"user-1"},"expires":"2025-11-20T15:02:32.865896537+01:00"}
{"key":"002","val":{"data":{"ip":"192.168.1.2","session":"session-002"},"role":"admin","sequence":2,"user":"user-2"},"expires":"2025-11-20T15:02:32.865950973+01:00"}
{"key":"003","val":{"data":{"ip":"192.168.1.3","session":"session-003"},"role":"admin","sequence":3,"user":"user-3"},"expires":"2025-11-20T15:02:32.865972408+01:00"}
{"key":"004","val":{"data":{"ip":"192.168.1.4","session":"session-004"},"role":"admin","sequence":4,"user":"user-4"},"expires":"2025-11-20T15:02:32.865988843+01:00"}
{"key":"005","val":{"data":{"ip":"192.168.1.5","session":"session-005"},"role":"admin","sequence":5,"user":"user-5"},"expires":"2025-11-20T15:02:32.866006958+01:00"}

Related issues

@orestisfl orestisfl self-assigned this Oct 27, 2025
@orestisfl orestisfl added enhancement backport-skip Skip notification from the automated backport with mergify Team:Elastic-Agent-Data-Plane Label for the Agent Data Plane team Draft labels Oct 27, 2025
@botelastic botelastic bot added needs_team Indicates that the issue/PR needs a Team:* label and removed needs_team Indicates that the issue/PR needs a Team:* label labels Oct 27, 2025
@leehinman commented:

One other idea I had was to stop registering the processors in the init function and instead do that inside beat configure, after the paths are initialized. For most processors we would just add the existing Constructor, but for ones that need a Path we could wrap them in a function that is a closure with the path set internally.

This has the advantage of getting rid of init calls, which slow down startup, but it would mean we need a registry of processors per beat. It is definitely more invasive, but it does make the beat more independent. If we come across a second or third thing that needs to be unique among processors, it would make adding those unique things easier.

@orestisfl orestisfl changed the title [cache-processor] WIP: SetPaths draft [cache-processor] Set beat Paths Nov 17, 2025
@orestisfl orestisfl changed the title [cache-processor] Set beat Paths [cache-processor] Set beat pths Nov 17, 2025
@orestisfl orestisfl changed the title [cache-processor] Set beat pths [cache-processor] Set beat paths Nov 17, 2025
@orestisfl orestisfl marked this pull request as ready for review November 20, 2025 13:11
@orestisfl orestisfl requested review from a team as code owners November 20, 2025 13:11
@orestisfl orestisfl requested review from faec and leehinman November 20, 2025 13:11
@elasticmachine
Copy link
Contributor

Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)

@orestisfl orestisfl removed discuss Issue needs further discussion. Draft labels Nov 20, 2025
@orestisfl orestisfl requested a review from efd6 November 21, 2025 08:39
@leehinman left a comment
question on possible race condition, but otherwise LGTM


func (p *cache) Close() error {
	p.cancel()
	if p.cancel != nil {
@leehinman:
I think we might have a race condition here. Is it possible that Close could be called before SetPaths? If so, I think we'll need a sync.Mutex or sync.Once to make sure we either go in the right order or don't do SetPaths if Close is called first.

@orestisfl (author):
Hmm, perhaps a filebeat input can call it while filebeat is exiting. Given that this problem is going to be shared across all processors, it could perhaps be better to implement the guard in safe_processor.go.


Development

Successfully merging this pull request may close these issues.

[beatreceiver] replace global paths in cache processor
