@@ -127,6 +127,9 @@ The `helm` chart for the latest release of the plugin (`v0.7.0-rc.2`) includes
a number of customizable values. The most commonly overridden ones are:

```
+ failOnInitError:
+   fail the plugin if an error is encountered during initialization, otherwise block indefinitely
+   (default 'true')
compatWithCPUManager:
  run with escalated privileges to be compatible with the static CPUManager policy
  (default 'false')
@@ -138,6 +141,18 @@ a number of customizable values. The most commonly overridden ones are:
[none | single | mixed] (default "none")
```

+ When set to true, the `failOnInitError` flag fails the plugin if an error is
+ encountered during initialization. When set to false, it prints an error
+ message and blocks the plugin indefinitely instead of failing. Blocking
+ indefinitely follows legacy semantics that allow the plugin to deploy
+ successfully on nodes that don't have GPUs (and aren't supposed to have
+ them) without throwing an error. This lets you deploy a daemonset with the
+ plugin on every node in your cluster, whether or not it has GPUs, without
+ encountering an error. The drawback is that there is no way to detect an
+ actual error on nodes that are supposed to have GPUs. Failing when an
+ initialization error is encountered is now the default and should be
+ adopted by all new deployments.
+
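As a minimal sketch of the legacy behavior described above, a values override file could set `failOnInitError` to false. The file name `values-override.yaml` is hypothetical, and the fragment assumes the option is a top-level chart value, as the list of values above suggests.

```yaml
# values-override.yaml (hypothetical file name)
# Assumption: failOnInitError is a top-level value of the chart.
# false = print the initialization error and block indefinitely (legacy
# semantics), so the plugin can be deployed on nodes without GPUs.
failOnInitError: false
```

Such a file would typically be passed to `helm install` through its `-f` (`--values`) option. Leaving the key unset keeps the default of true.
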
The `compatWithCPUManager` flag configures the daemonset to be able to
interoperate with the static `CPUManager` of the `kubelet`. Setting this flag
requires one to deploy the daemonset with elevated privileges, so only do so if