Hello,
I'm trying to get a better understanding of how Filebeat handles multiple inputs, especially in high-volume scenarios.
1. Input Consumption Behavior
If Filebeat is reading from multiple log files—some high-volume and some low-volume—does it process these inputs evenly, or is input consumption proportional to the volume of logs generated? In other words, can low-volume files get "starved" if high-volume files are constantly producing data?
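For context, the kind of setup I mean looks roughly like this (paths are placeholders, and I'm using the `log` input type only as an example):

```yaml
filebeat.inputs:
  # High-volume source (placeholder path)
  - type: log
    paths:
      - /var/log/app-busy/*.log

  # Low-volume source (placeholder path)
  - type: log
    paths:
      - /var/log/app-quiet/*.log
```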
2. Quality of Service / Fairness
Is there a way to ensure fair processing or equal priority between inputs? For example, if one file is generating a million log lines, can Filebeat still reliably pick up and forward new lines from a low-volume file without delay?
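The closest per-input settings I've come across so far are things like `harvester_limit` and `scan_frequency`, but it isn't clear to me whether any of them actually provide fairness across inputs. A sketch of what I mean (values are arbitrary):

```yaml
filebeat.inputs:
  - type: log
    paths:
      - /var/log/app-quiet/*.log
    # Settings that seem related to how often and how widely Filebeat reads;
    # I'm not sure whether any of them affect fairness between inputs.
    scan_frequency: 1s      # how often this input checks for new/updated files
    harvester_limit: 10     # max harvesters started in parallel for this input
    close_inactive: 5m      # close files that have seen no new lines for this long
```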
3. Our Use Case: Kubernetes Logging
We run a Kubernetes cluster where all logs are shipped to a single worker node. Filebeat runs on that node and forwards the logs to Logstash.
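For concreteness, I'm picturing a config along these lines on that node (paths and hostnames are placeholders, and `NODE_NAME` is assumed to be injected via the environment):

```yaml
filebeat.inputs:
  # Container log files written by the container runtime on this node
  - type: container
    paths:
      - /var/log/containers/*.log

processors:
  # Enrich events with pod/container metadata so we can filter per application
  - add_kubernetes_metadata:
      host: ${NODE_NAME}
      matchers:
        - logs_path:
            logs_path: "/var/log/containers/"

output.logstash:
  hosts: ["logstash.example.internal:5044"]
```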
We’re concerned about the following scenario:
If one application suddenly starts producing an excessive volume of logs (e.g., 100x its normal rate), how can we keep Filebeat from being overwhelmed by that single source?
- Is it possible to configure Filebeat to discard such excessive logs based on volume, source, or content? (I've put a rough sketch of what I mean after this list.)
- More importantly, how can we ensure that logs from other applications continue to be collected and forwarded, even under such stress?
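To make the first bullet concrete, this is the kind of thing I'm imagining, using the `drop_event` and `rate_limit` processors (the container name "noisy-app" is a placeholder, and I'm not sure whether `rate_limit` is the right tool here or even available in our version):

```yaml
processors:
  # Option A: drop everything from a known-noisy container outright
  # ("noisy-app" is a placeholder name).
  - drop_event:
      when:
        equals:
          kubernetes.container.name: "noisy-app"

  # Option B: cap the event rate instead of dropping everything,
  # keyed per container so one application cannot use the whole budget.
  - rate_limit:
      fields:
        - kubernetes.container.name
      limit: "1000/m"
```

Presumably we'd pick only one of the two in practice; I'm showing both to illustrate what I mean by "discard based on volume, source, or content".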
Any guidance or best practices for configuring Filebeat to handle these cases would be greatly appreciated.
Thank you!
Vojtech