Conversation

@kleimkuhler (Contributor) commented Mar 29, 2021

What

This change adds the config.linkerd.io/proxy-await annotation which, when set, delays application container start until the proxy is ready. This allows users to force application containers to wait for the proxy container to be ready without modifying the application's Docker image. This differs from the current use of linkerd-await, which does require modifying the image.


To support this, Linkerd relies on the fact that containers are started in the order that they appear in spec.containers. If linkerd-proxy is the first container, then it is started first.

Kubernetes starts each container without waiting on the result of the previous one. However, if a container has a hook that executes immediately after container creation, Kubernetes waits on the result of that hook before creating the next container. By running the linkerd-await binary in a postStart hook on the linkerd-proxy container, Kubernetes is forced to pause container creation until the proxy is ready. Once linkerd-await completes, the container hook completes and the application container is created.

Adding the config.linkerd.io/proxy-await annotation to a pod's metadata results in the linkerd-proxy container being the first container, as well as having the container hook:

postStart:
  exec:
    command:
    - /usr/lib/linkerd/linkerd-await
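
For illustration, a rough sketch of what the injected pod spec might then look like, with the proxy placed first and the hook attached (the app container name and image references below are placeholders, not output from the injector):

spec:
  containers:
  - name: linkerd-proxy                      # first, so it starts before the app
    image: cr.l5d.io/linkerd/proxy           # placeholder image reference
    lifecycle:
      postStart:
        exec:
          command:
          - /usr/lib/linkerd/linkerd-await   # blocks until the proxy is ready
  - name: app                                # created only after the hook completes
    image: example/app:latest                # placeholder application image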

Update after draft

There has been some additional discussion both off GitHub as well as on this PR (specifically with @electrical).

First, we decided that this feature should be enabled by default. The reason for this is that, more often than not, it will prevent start-up ordering issues from occurring without having any negative effects on the application. Additionally, it will be part of edge releases up until 2.11 (the next stable release), and having it enabled by default will allow us to check that it does not often conflict with applications. Once we are closer to 2.11, we'll be able to determine if it should be disabled by default because it causes more issues than it prevents.

Second, this feature will remain configurable; if disabled, then upon injection the proxy container will not be made the first container in the pod manifest. This is important for the reasons discussed with @electrical about tools that assume the app container is the first container. For example, Rancher defaults to showing overview pages for the container at index 0, and if the proxy container were always at index 0 this would defeat the purpose of the overview page.

Testing

To test this I used the sleep.sh script below and changed Dockerfile-proxy to use it as its ENTRYPOINT. This forces the container to sleep for 20 seconds before starting the proxy.


sleep.sh:

#!/bin/bash
echo "sleeping..."
sleep 20
/usr/bin/linkerd2-proxy-run

Dockerfile-proxy:

...
COPY sleep.sh /sleep.sh
RUN ["chmod", "+x", "/sleep.sh"]
ENTRYPOINT ["/sleep.sh"]

# Build and install with the above changes
$ bin/docker-build
...
$ bin/image-load --k3d
...
$ bin/linkerd install | kubectl apply -f -

Annotate the emoji deployment so that it's the only workload that waits for its proxy to be ready, and inject it:

cat emojivoto.yaml | bin/linkerd inject - | kubectl apply -f -
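
The annotation itself goes on the workload's pod template before injection; a rough sketch of the relevant fields in emojivoto.yaml might look like this (assuming the deployment is named emoji; all other fields omitted):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: emoji
  namespace: emojivoto
spec:
  template:
    metadata:
      annotations:
        config.linkerd.io/proxy-await: "enabled"   # only this workload waits for its proxy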

You can then see that the emoji deployment is not starting its application container until the proxy is ready:

$ kubectl get -n emojivoto pods
NAME                        READY   STATUS            RESTARTS   AGE
voting-ff4c54b8d-sjlnz      1/2     Running           0          9s
emoji-f985459b4-7mkzt       0/2     PodInitializing   0          9s
web-5f86686c4d-djzrz        1/2     Running           0          9s
vote-bot-6d7677bb68-mv452   1/2     Running           0          9s

Signed-off-by: Kevin Leimkuhler [email protected]

@olix0r (Member) commented Mar 30, 2021

wdyt about calling the annotation config.linkerd.io/proxy-await -- this is more consistent with the existing annotations like proxy-log-level, etc

@kleimkuhler (Contributor, Author) commented Mar 30, 2021

Values for the annotation have been changed to enabled or disabled. The annotation name has also been changed to config.linkerd.io/proxy-await.

I'll keep this as a draft until linkerd/linkerd-await#22 merges, but it should be good for review after that.

olix0r pushed a commit to linkerd/linkerd-await that referenced this pull request Mar 31, 2021
`CMD` is not required if the use-case of `linkerd-await` is only to wait for readiness
but do nothing afterwards.

In linkerd/linkerd2#5967 this is the case. `linkerd-await` is executed as a container
hook and the only thing that it needs to do is prevent the hook from finishing until
the proxy is ready. Once it is ready, it exits without running any additional commands.

Signed-off-by: Kevin Leimkuhler <[email protected]>
@kleimkuhler kleimkuhler marked this pull request as ready for review March 31, 2021 20:19
@kleimkuhler kleimkuhler requested a review from a team as a code owner March 31, 2021 20:19
@adleong (Member) left a comment

This is so cool!

RUN (proxy=$(bin/fetch-proxy $(cat proxy-version) $TARGETARCH) && \
    mv "$proxy" linkerd2-proxy)
ARG LINKERD_AWAIT_VERSION=v0.2.3
RUN curl -fsSvLo linkerd-await https://github.com/linkerd/linkerd-await/releases/download/release%2F${LINKERD_AWAIT_VERSION}/linkerd-await-${LINKERD_AWAIT_VERSION}-${TARGETARCH} && chmod +x linkerd-await
A Member commented on the diff above:

Would it be better to have linkerd-await publish a docker image and then pull the binary from the docker image instead of from the releases page?

@kleimkuhler kleimkuhler marked this pull request as draft April 2, 2021 13:24
@kleimkuhler (Contributor, Author) commented:

Converting back to draft after some additional discussion off GH yesterday. I'll be removing the configuration options for this so that it is an always-enabled feature.

@electrical commented Apr 7, 2021

Making the linkerd-proxy container the first (index 0) container might confuse people and potentially break behaviour for others.
Some tools might depend on the index of the container list, for example.
Also, some UIs show the image value of the first container in the overview, which would then show linkerd-proxy everywhere :-)

@adleong (Member) commented Apr 7, 2021

Thanks @electrical! Are there any specific examples you can point to of behaviors or UIs which rely on container ordering and would be negatively impacted by moving the linkerd-proxy container to index 0?

@electrical commented:

@adleong Rancher is a good example: in the different overview pages (Pod, Deployment, StatefulSet, etc.) it shows the container at index 0. If it showed the linkerd-proxy container there, I would have to go into the details of that item to see the container I actually care about, which defeats the purpose of that overview.

In the case of other systems like Kyverno, I assume with my own deployments that my container is at index 0 and apply certain mutations based on that (it can only do so based on index, not name).

These are the only two examples that I have / work with. Not sure if there are other systems that depend on the ordering.
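
For illustration only, a minimal sketch of the kind of index-based Kyverno mutation described above (the policy name and the patched field are hypothetical):

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: mutate-first-container        # hypothetical policy name
spec:
  rules:
  - name: patch-container-zero
    match:
      resources:
        kinds:
        - Deployment
    mutate:
      # the JSON patch addresses the container by index, not by name
      patchesJson6902: |-
        - op: add
          path: /spec/template/spec/containers/0/imagePullPolicy
          value: IfNotPresent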

@kleimkuhler (Contributor, Author) commented:

@electrical Thanks, this is helpful! So those examples assume the index of the container, but don't actually require it to be first.

For the Rancher case, I assume there is some way to configure it to either look at a different index or look by container name. For the Kyverno case, this again sounds like a scenario where looking by container name is a safer thing to do.

I say this because I think there is an important distinction between applications that require being the first container, and applications that are only assuming so. For this feature to work, the proxy requires that it is the first container.

As I stated above, right now the plan is to make this feature always enabled and not configurable. Do you think that if this were always enabled but configurable, you'd find yourself disabling it for the examples you listed?

If so, I think that may be a good reason to at least make this configurable in the first iteration. As it stays in edge releases and we move closer to the next stable, maybe that will give more time for changes to applications that make these assumptions.

@electrical commented:

@kleimkuhler In my case I would disable it, yeah. Partially because linkerd-proxy becomes index 0, but also because I'm not sure Linkerd should do this.
In my own applications, for example, everything works fine except for specific apps that need to wait for linkerd-proxy, and I solve that by adding linkerd-await to my containers. Since it's only an issue for a small set of use cases (as far as I know), do you really want such a major change?
I'm not specifically against it, because I understand the reasoning of improving the user experience, but not being able to disable this behaviour would make it a negative experience for me.

@kleimkuhler (Contributor, Author) commented:

@electrical Yep, that makes sense. The biggest case solved by making it always enabled is when users have some external image that they cannot, or do not wish to, rebuild so that it is wrapped by linkerd-await. Even then, the container may not actually need linkerd-await, but using it shouldn't affect anything.

If it is kept configurable, the other question is whether it is enabled or disabled by default. For edge releases it's probably better to enable it by default. That way, we have more of a chance of running into cases where this affects a broader set of applications. Once we get closer to a stable release and have a better idea of how helpful this feature is, we can make the call on its default behavior in the stable.

@electrical commented:

@kleimkuhler I completely understand and I'm happy with the path you've set out :-)

@kleimkuhler kleimkuhler marked this pull request as ready for review April 13, 2021 20:00
@kleimkuhler (Contributor, Author) commented:

This comment has been copied up into the PR description



@kleimkuhler kleimkuhler requested a review from adleong April 13, 2021 20:01
@kleimkuhler kleimkuhler self-assigned this Apr 13, 2021
@kleimkuhler kleimkuhler added this to the stable-2.11.0 milestone Apr 13, 2021
{{- $r := merge .Values.publicAPIProxyResources .Values.proxy.resources }}
{{- $_ := set $tree.Values.proxy "resources" $r }}
{{- end }}
{{- $_ := set $tree.Values.proxy "await" true }}
@kleimkuhler (Contributor, Author) commented on the diff above:

Because core components are not admitted by the proxy-injector, we cannot rely on annotating these components with config.linkerd.io/proxy-await: "enabled".

Therefore, the template must override Values here so that proxy.await is always true. This ensures that even when users install Linkerd and explicitly disable this feature for their applications, the control plane still has the feature enabled.

This is true for templates/{destination.yaml, proxy-injector.yaml, sp-validator.yaml}.

@kleimkuhler (Contributor, Author) commented:

It would be super cool if @mateiidavid's #6002 merged before this. That way, the Viz and Jaeger extensions that currently have the config.linkerd.io/proxy-await annotation on all their deployments could switch to a single annotation on their namespace.

@alpeb (Member) left a comment

Awesome! 👍

TIOLI: I noted the identity workload's proxy container will have the post-start hook. The main container will be triggered first, so in theory linkerd-await shouldn't block anything. But it might be worth adding {{- $_ := set $tree.Values.proxy "await" false }} in that case before calling the proxy partial, just to avoid the unnecessary call 🤷‍♂️

@kleimkuhler (Contributor, Author) commented:

@alpeb Good call; it's safer to explicitly disable the hook rather than relying on the container ordering. It has been added.

@adleong (Member) left a comment

Minor mismatch with the annotation boolean format. Otherwise looks good!

@kleimkuhler kleimkuhler requested a review from adleong April 19, 2021 20:32