Limit written files to a maximum size in the file sink #2057

@binarylogic

Description

When writing log data to a file, it's common for users to want to cap each file at a maximum size. Currently, this is not possible with Vector's file sink. There are two ways I can see to approach this:

1. Batching

Similar to the aws_s3 sink, we could batch data in memory and flush it to a file once the batch reaches the size limit (see the sketch after the pros and cons below).

Pros

  1. Each batch is written to its file in a single operation, so we avoid having to deal with files being deleted out from under us, etc.
  2. It should be simpler to implement given that we already have batching support in place.

Cons

  1. Data is not immediately available in the file.
  2. It's inefficient. If the destination is the disk, it seems strange to buffer data in memory (or on disk) before writing it.
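
For illustration, here is a minimal sketch of the batching approach in Rust. Everything in it is hypothetical: the `BatchingFileSink` type, the `max_size` threshold, and the `{n}` file-naming scheme are assumptions for the sketch, not existing Vector code.

```rust
use std::fs::File;
use std::io::{self, Write};

/// Hypothetical batching file sink: events accumulate in memory and are
/// flushed to a fresh file once the batch reaches `max_size` bytes.
struct BatchingFileSink {
    path_template: String, // e.g. "out-{n}.log"; the naming scheme is an assumption
    max_size: usize,       // flush threshold in bytes
    buffer: Vec<u8>,
    file_index: usize,
}

impl BatchingFileSink {
    fn write_event(&mut self, event: &[u8]) -> io::Result<()> {
        self.buffer.extend_from_slice(event);
        self.buffer.push(b'\n');
        if self.buffer.len() >= self.max_size {
            self.flush_batch()?;
        }
        Ok(())
    }

    fn flush_batch(&mut self) -> io::Result<()> {
        // Each batch gets its own file, so a write never straddles a
        // rotation boundary and we never reopen a file we already wrote.
        let path = self.path_template.replace("{n}", &self.file_index.to_string());
        File::create(&path)?.write_all(&self.buffer)?;
        self.buffer.clear();
        self.file_index += 1;
        Ok(())
    }
}
```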

2. Streaming / Monitoring

Alternatively, we could write events through immediately and monitor the destination file, rotating it when its size exceeds the limit (a sketch follows the cons below).

Pros

  1. Much more efficient in terms of resource usage.
  2. Data is immediately available in the file.

Cons

  1. Seems more complex.
  2. How do we deal with files being removed out from under us?
  3. Is file rotation outside the scope of Vector? There are tools built specifically for this (e.g., logrotate).
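
And a minimal sketch of the streaming/rotation approach, again with hypothetical names (`StreamingFileSink`, the `.1` rotation suffix are assumptions, not an existing API):

```rust
use std::fs::{self, File, OpenOptions};
use std::io::{self, Write};
use std::path::PathBuf;

/// Hypothetical streaming file sink: every event is written through
/// immediately, and the file is rotated once it exceeds `max_size`.
struct StreamingFileSink {
    path: PathBuf,
    max_size: u64,
    file: File,
}

impl StreamingFileSink {
    fn open(path: PathBuf, max_size: u64) -> io::Result<Self> {
        let file = OpenOptions::new().create(true).append(true).open(&path)?;
        Ok(Self { path, max_size, file })
    }

    fn write_event(&mut self, event: &[u8]) -> io::Result<()> {
        // Stat through the open handle before each write. This catches
        // truncation, but not deletion: on Unix an unlinked file happily
        // keeps accepting writes, which is the "removed out from under us"
        // problem noted above.
        if self.file.metadata()?.len() >= self.max_size {
            self.rotate()?;
        }
        self.file.write_all(event)?;
        self.file.write_all(b"\n")
    }

    fn rotate(&mut self) -> io::Result<()> {
        // Rename the full file aside and reopen a fresh one at the
        // original path; the ".1" suffix scheme is an assumption.
        let mut rotated = self.path.clone().into_os_string();
        rotated.push(".1");
        fs::rename(&self.path, &rotated)?;
        self.file = OpenOptions::new().create(true).append(true).open(&self.path)?;
        Ok(())
    }
}
```

Statting before every write is shown here for simplicity; a real implementation would more likely track a running byte count to avoid the per-event syscall, at the cost of drifting if another process touches the file.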

Metadata

Labels

  - needs: approval - Needs review & approval before work can begin.
  - needs: more demand - Needs more demand before work can begin, +1 or comment to support.
  - sink: file - Anything `file` sink related.
  - type: enhancement - A value-adding code change that enhances its existing functionality.
