Add support for awaiting proxy readiness #5967
Signed-off-by: Kevin Leimkuhler <[email protected]>
wdyt about calling the annotation |
Values for the annotation have been changed to I'll keep this draft until linkerd/linkerd-await#22 merges, but should be good for review after. |
`CMD` is not required if the use-case of `linkerd-await` is only to wait for readiness but do nothing afterwards. In linkerd/linkerd2#5967 this is the case. `linkerd-await` is executed as a container hook and the only thing that it needs to do is prevent the hook from finishing until the proxy is ready. Once it is ready, it exits without running any additional commands. Signed-off-by: Kevin Leimkuhler <[email protected]>
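The "wait, then optionally run a command" behavior described in this commit message can be sketched as follows. This is a minimal Go illustration, not the actual `linkerd-await` implementation (which is a separate binary with its own flags): the `awaitProxy` and `run` functions, the attempt count, and the backoff are all assumptions made for the example, and the readiness check is passed in as a function so that no real proxy endpoint is assumed.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// awaitProxy polls the (hypothetical) ready function until it reports true,
// sleeping between attempts, and gives up after the given number of attempts.
func awaitProxy(ready func() bool, attempts int, backoff time.Duration) error {
	for i := 0; i < attempts; i++ {
		if ready() {
			return nil
		}
		time.Sleep(backoff)
	}
	return errors.New("proxy did not become ready")
}

// run blocks until the proxy is ready. If no command was given (the container
// hook use-case), it simply returns; otherwise it hands the command to exec.
func run(ready func() bool, cmd []string, exec func([]string) error) error {
	if err := awaitProxy(ready, 50, 10*time.Millisecond); err != nil {
		return err
	}
	if len(cmd) == 0 {
		return nil // nothing to run: returning is what unblocks the hook
	}
	return exec(cmd)
}

func main() {
	polls := 0
	// Fake readiness check for the demo: reports ready on the third poll.
	ready := func() bool { polls++; return polls >= 3 }
	err := run(ready, nil, func([]string) error { return nil })
	fmt.Println("polls:", polls, "err:", err)
}
```

With an empty command the function returns as soon as readiness is observed, which is the behavior a `PostStart` hook needs: the hook only has to block, not launch anything.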
adleong
left a comment
This is so cool!
RUN (proxy=$(bin/fetch-proxy $(cat proxy-version) $TARGETARCH) && \
    mv "$proxy" linkerd2-proxy)
ARG LINKERD_AWAIT_VERSION=v0.2.3
RUN curl -fsSvLo linkerd-await https://github.com/linkerd/linkerd-await/releases/download/release%2F${LINKERD_AWAIT_VERSION}/linkerd-await-${LINKERD_AWAIT_VERSION}-${TARGETARCH} && chmod +x linkerd-await
Would it be better to have linkerd-await publish a docker image and then pull the binary from the docker image instead of from the releases page?
Converting back to draft after some additional discussion off GH yesterday. I'll be removing the configuration options for this so that it is an always-enabled feature.
Making the linkerd-proxy container the first (0) image might confuse people and potentially break behaviour for others.
Thanks @electrical! Are there any specific examples you can point to of behaviors or UIs which rely on container ordering and would be negatively impacted by moving the linkerd-proxy container to index 0?
@adleong Rancher is a good example: in the different overview pages (Pod, Deployment, StatefulSet, etc.) it shows the index 0 container. If it showed the linkerd-proxy container there, I would need to go into the details of that item to see the container I actually care about, thus defeating the purpose of that overview. In the case of other systems like Kyverno, I assume in my own deployments that my container is at index 0 and apply certain mutations to it (it can only do this based on index, not name). These are the only 2 examples that I have / work with. Not sure if there are other systems that depend on the ordering.
@electrical Thanks, this is helpful! So those examples are assuming the index of the container, but not actually requiring it to be first. For the Rancher case, I assume there is some way to configure it to either look at a different index or look by container name. For the Kyverno case, this again sounds like a scenario where looking by container name is a safer thing to do.

I say this because I think there is an important distinction between applications that require being the first container, and applications that are only assuming so. For this feature to work, the proxy requires that it is the first container.

As I stated above, right now the plan is to make this feature always-enabled and not configurable. If this were enabled by default but configurable, do you think you'd find yourself disabling it for the examples you listed? If so, I think this may be a good reason to at least make this configurable in the first iteration. As it stays in edges though and we move closer to a next stable, maybe that will give more time for changes to applications that make these assumptions.
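To illustrate the distinction drawn above between assuming an index and looking a container up by name, here is a minimal Go sketch. The `Container` type, the `containerByName` helper, and the container and image names are all hypothetical and exist only for this example; they are not taken from Rancher, Kyverno, or the Linkerd codebase.

```go
package main

import "fmt"

// Container is a hypothetical, stripped-down stand-in for a pod's container entry.
type Container struct {
	Name  string
	Image string
}

// containerByName returns the named container and true, or false if it is absent.
// Unlike indexing, the result does not change when a sidecar is moved to index 0.
func containerByName(containers []Container, name string) (Container, bool) {
	for _, c := range containers {
		if c.Name == name {
			return c, true
		}
	}
	return Container{}, false
}

func main() {
	// Assumed layout after injection with the await feature enabled:
	// the proxy sits at index 0 and the application container follows.
	pod := []Container{
		{Name: "linkerd-proxy", Image: "proxy-image"},
		{Name: "app", Image: "app-image"},
	}
	fmt.Println("by index 0:", pod[0].Name)
	if c, ok := containerByName(pod, "app"); ok {
		fmt.Println("by name:", c.Name)
	}
}
```

A tool that reads `pod[0]` now sees the proxy, while a lookup by name still finds the application container regardless of ordering.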
@kleimkuhler In my case I would disable it, yeah. Partly because linkerd-proxy becomes index 0, but also because I'm not sure Linkerd should do this.
@electrical Yep that makes sense. The biggest case that is solved by making it always enabled is when users have some external image that they cannot—or do not wish to—build themselves so that it is wrapped by If it is kept configurable, the other question is if it is enabled or disabled by default. For edges it's probably better to make it enabled by default. That way, we have more a chance of running into cases where this does affect a broader set of applications. Once we get closer to a stable and have a better idea about how helpful this feature is, we can make the call on what its default behavior is in the stable. |
@kleimkuhler I completely understand and am happy with the path you set out :-)
This comment has been copied up into the PR description.

There has been some additional discussion both off GitHub as well as on this PR (specifically with @electrical).

First, we decided that this feature should be enabled by default. The reason for this is more often than not, this feature will prevent start-up ordering issues from occurring without having any negative effects on the application. Additionally, this will be a part of edges up until the 2.11 (the next stable release) and having it enabled by default will allow us to check that it does not conflict often with applications. Once we are closer to 2.11, we'll be able to determine if this should be disabled by default because it causes more issues than it prevents.

Second, this feature will remain configurable; if disabled, then upon injection the proxy container will not be made the first container in the pod manifest. This is important for the reasons discussed with @electrical about tools that make assumptions about app containers being the first container. For example, Rancher defaults to showing overview pages for the `0` index container, and if the proxy container was always `0` then this would defeat the purpose of the overview page.
{{- $r := merge .Values.publicAPIProxyResources .Values.proxy.resources }}
{{- $_ := set $tree.Values.proxy "resources" $r }}
{{- end }}
{{- $_ := set $tree.Values.proxy "await" true }}
Because core components are not admitted by the proxy-injector, we cannot rely on annotating these components with config.linkerd.io/proxy-await: "enabled".
Therefore, the template must override Values here so that proxy.await is always true. This ensures that when users install Linkerd and explicitly disable this feature for their application, the control plane still has the feature enabled.
This is true for templates/{destination.yaml, proxy-injector.yaml, sp-validator.yaml}.
What would be super cool is if @mateiidavid's #6002 merges before this. That way, the Viz and Jaeger extensions that currently have the |
alpeb
left a comment
Awesome! 👍
TIOLI: I noted the identity workload's proxy container will have the post-start hook. The main container will be triggered first, so in theory linkerd-await shouldn't block anything. But it might be worth adding {{- $_ := set $tree.Values.proxy "await" false }} in that case before calling the proxy partial, just to avoid the unnecessary call 🤷‍♂️
@alpeb Good call; it's safer to explicitly disable the hook rather than relying on the container ordering. It has been added.
adleong
left a comment
Minor mismatch with the annotation boolean format. Otherwise looks good!
What

This change adds the `config.linkerd.io/proxy-await` annotation which, when set, will delay application container start until the proxy is ready. This allows users to force application containers to wait for the proxy container to be ready without modifying the application's Docker image. This is different from the current use-case of `linkerd-await`, which does require modifying the image.

To support this, Linkerd is using the fact that containers are started in the order that they appear in `spec.containers`. If `linkerd-proxy` is the first container, then it will be started first.

Kubernetes will start each container without waiting on the result of the previous container. However, if a container has a hook that is executed immediately after container creation, then Kubernetes will wait on the result of that hook before creating the next container. Using a `PostStart` hook in the `linkerd-proxy` container, the `linkerd-await` binary can be run and force Kubernetes to pause container creation until the proxy is ready. Once `linkerd-await` completes, the container hook completes and the application container is created.

Adding the `config.linkerd.io/proxy-await` annotation to a pod's metadata results in the `linkerd-proxy` container being the first container, as well as having the container hook.

Update after draft
There has been some additional discussion both off GitHub as well as on this PR (specifically with @electrical).
First, we decided that this feature should be enabled by default. The reason for this is more often than not, this feature will prevent start-up ordering issues from occurring without having any negative effects on the application. Additionally, this will be a part of edges up until the 2.11 (the next stable release) and having it enabled by default will allow us to check that it does not conflict often with applications. Once we are closer to 2.11, we'll be able to determine if this should be disabled by default because it causes more issues than it prevents.
Second, this feature will remain configurable; if disabled, then upon injection the proxy container will not be made the first container in the pod manifest. This is important for the reasons discussed with @electrical about tools that make assumptions about app containers being the first container. For example, Rancher defaults to showing overview pages for the `0` index container, and if the proxy container was always `0` then this would defeat the purpose of the overview page.

Testing

To test this I used the `sleep.sh` script and changed `Dockerfile-proxy` to use it as its `ENTRYPOINT`. This forces the container to sleep for 20 seconds before starting the proxy.

`sleep.sh`:

`Dockerfile-proxy`:

Annotate the `emoji` deployment so that it's the only workload that should wait for its proxy to be ready and inject it:

You can then see that the `emoji` deployment is not starting its application container until the proxy is ready:

Signed-off-by: Kevin Leimkuhler [email protected]
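The injection behavior described under "What" (proxy placed first in `spec.containers`, with a `PostStart` hook that runs `linkerd-await`) can be sketched roughly as follows. This is a simplified Go illustration and not the proxy-injector's actual code: the `Container` type, the `injectWithAwait` function, and the `awaitPath` location of the binary are assumptions made for the example.

```go
package main

import "fmt"

// Container is a hypothetical stand-in for a pod spec container entry;
// it carries only the fields this sketch needs.
type Container struct {
	Name      string
	PostStart []string // exec command of a postStart lifecycle hook, if any
}

const proxyName = "linkerd-proxy"

// awaitPath is an assumed location of the linkerd-await binary in the proxy image.
const awaitPath = "/usr/lib/linkerd/linkerd-await"

// injectWithAwait returns a new container list with the proxy sidecar added.
// With await enabled, the proxy goes first and gets a postStart hook, so the
// remaining containers are not created until the hook returns. With await
// disabled, the proxy is appended and application containers keep their indexes.
func injectWithAwait(app []Container, await bool) []Container {
	proxy := Container{Name: proxyName}
	if !await {
		return append(append([]Container{}, app...), proxy)
	}
	proxy.PostStart = []string{awaitPath}
	return append([]Container{proxy}, app...)
}

func main() {
	app := []Container{{Name: "app"}} // illustrative application container
	for i, c := range injectWithAwait(app, true) {
		fmt.Printf("%d: %s postStart=%v\n", i, c.Name, c.PostStart)
	}
}
```

The disabled branch reflects the configurability discussed in the update: when the feature is off, the application container stays at index 0.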