Description
When Filebeat is restarted, it can re-ingest files if they have been rotated and the rotated paths are also monitored by Filestream.
Given the following Filestream configuration:
filebeat.inputs:
  - type: filestream
    id: oops-I-re-ingested-a-file
    paths:
      - /tmp/*.log
output.console:
  enabled: true
  pretty: true
And a file /tmp/flog.log with some data.
- Create a file with some data
docker run -it --rm mingrammer/flog -n 2 > /tmp/flog.log
- Start Filebeat with the configuration above
- Wait until the file is fully ingested (no more events on the console)
- Stop Filebeat
- Move /tmp/flog.log to /tmp/flog-1.log:
mv /tmp/flog.log /tmp/flog-1.log
- Start Filebeat
Once Filebeat is restarted the file is re-ingested.
However, if a new /tmp/flog.log is created after moving /tmp/flog.log to /tmp/flog-1.log (the contents do not matter), as in a common log rotation strategy, no data is duplicated.
The actual issue comes from how the store cleanup is implemented:
beats/filebeat/input/filestream/prospector.go
Lines 89 to 101 in 5449535
if p.cleanRemoved {
	cleaner.CleanIf(func(v loginp.Value) bool {
		var fm fileMeta
		err := v.UnpackCursorMeta(&fm)
		if err != nil {
			// remove faulty entries
			return true
		}
		_, ok := files[fm.Source]
		return !ok
	})
}
It checks whether meta.source from the registry entry matches any of the files currently discovered by the filewatcher. If there is no match, the entry is removed.
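To make the effect of that check concrete, here is a minimal, self-contained sketch of the same predicate applied to the reproduction steps above. It is not the Filebeat implementation: the cleanRemoved helper, the map-based registry, and the "inode-42" key are hypothetical stand-ins, and only the Source field of fileMeta is modelled.

```go
package main

import "fmt"

// fileMeta is a simplified stand-in for the cursor metadata in the
// snippet above; only the Source path matters for this illustration.
type fileMeta struct {
	Source string
}

// cleanRemoved is a hypothetical helper that applies the same rule as the
// CleanIf callback: an entry survives only if its recorded source path is
// among the currently discovered paths.
func cleanRemoved(registry map[string]fileMeta, discovered map[string]struct{}) map[string]fileMeta {
	kept := make(map[string]fileMeta)
	for key, fm := range registry {
		if _, ok := discovered[fm.Source]; ok {
			kept[key] = fm
		}
	}
	return kept
}

func main() {
	// Registry state after the first run. The key is an assumed
	// placeholder for whatever identifies the file in the store.
	registry := map[string]fileMeta{
		"inode-42": {Source: "/tmp/flog.log"},
	}

	// After `mv /tmp/flog.log /tmp/flog-1.log`, only the new path is found.
	discovered := map[string]struct{}{
		"/tmp/flog-1.log": {},
	}

	kept := cleanRemoved(registry, discovered)
	fmt.Println("entries kept:", len(kept)) // prints "entries kept: 0"
}
```

Under these assumptions, the entry for the moved file is dropped because its recorded path no longer appears among the discovered files, even though the same file is still being monitored under its new name. That would be consistent with the re-ingestion observed after the restart.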