Description
Describe the enhancement:
Currently, the Elastic Agent running in a container does not expose a liveness HTTP endpoint that Kubernetes can use to check the overall health of the Elastic Agent container. One needs to be added so that Kubernetes can restart the Elastic Agent when it is not working correctly.
Some conditions that the liveness probe should report as failures:
- Not able to connect to Fleet Server (in managed mode)
- Overall bad state of inputs after a period of time
The liveness probe should have subpaths defined for inputs that need to monitor their own liveness:
/liveness/endpoint - Checks if endpoint should be alive (see https://github.com/elastic/security-team/issues/3449#issuecomment-1112559420 for more details)
Describe a specific use case for the enhancement or feature:
This needs to be added so that in the case that the Elastic Agent (or an integration that runs in a sidecar) is not working correctly it can be restarted by Kubernetes.
Additional Requirements
Enabling the liveness endpoint requires the ability to enable, and possibly modify, the agent HTTP configuration. This configuration is currently not reloadable and cannot be set from Fleet. For Fleet-managed users to benefit from the liveness endpoint, we should make sure it can be turned on from Fleet.
elastic-agent/elastic-agent.yml, lines 75 to 82 in b3e8275:

```yaml
# http:
#   # enables http endpoint
#   enabled: false
#   # The HTTP endpoint will bind to this hostname, IP address, unix socket or named pipe.
#   # When using IP addresses, it is recommended to only use localhost.
#   host: localhost
#   # Port on which the HTTP endpoint will bind. Default is 0 meaning feature is disabled.
#   port: 6791
```