Summary
AuthProxyWorkload status appears to grow without bound when spec.workloadSelector.kind: Pod matches short-lived pods with unique names. The controller replaces an existing status entry only when (name, namespace, kind, version) matches; otherwise it appends a new entry. I could not find pruning of status entries for pods that no longer exist.
For high-churn workloads such as workflow/batch pods, deleted pod names therefore remain in status.WorkloadStatus permanently. Over time the APW object can become large enough that status updates fail with Kubernetes/etcd request-size errors, leaving reconcile status stale and preventing spec changes from being reflected in status.
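The replace-or-append behavior described above can be sketched roughly as follows. This is a minimal illustration, not the operator's actual code: the `workloadStatus` type, its fields, and the `upsert` and `sameKey` helpers are hypothetical names chosen for this example.

```go
package main

import "fmt"

// workloadStatus is a hypothetical stand-in for one entry in
// status.WorkloadStatus; the operator's real type may differ.
type workloadStatus struct {
	Kind, Version, Namespace, Name string
	Message                        string // placeholder for per-workload conditions
}

// sameKey reports whether two entries share the (name, namespace, kind, version) key.
func sameKey(a, b workloadStatus) bool {
	return a.Name == b.Name && a.Namespace == b.Namespace &&
		a.Kind == b.Kind && a.Version == b.Version
}

// upsert replaces the entry with a matching key, or appends a new entry
// when no existing entry matches. Nothing is ever removed.
func upsert(entries []workloadStatus, s workloadStatus) []workloadStatus {
	for i := range entries {
		if sameKey(entries[i], s) {
			entries[i] = s
			return entries
		}
	}
	return append(entries, s)
}

func main() {
	var entries []workloadStatus
	// Simulate 1000 short-lived pods, each with a unique name.
	for i := 0; i < 1000; i++ {
		entries = upsert(entries, workloadStatus{
			Kind: "Pod", Version: "v1", Namespace: "batch",
			Name: fmt.Sprintf("job-%d", i),
		})
	}
	fmt.Println(len(entries)) // prints 1000: one entry per pod name ever seen
}
```

Because each pod name is unique, every reconcile of a new pod takes the append path, so the slice length tracks the total number of pods ever matched rather than the number currently running.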
Observed impact
A Pod-selector APW matching ephemeral workflow pods accumulated thousands of stale pod entries, and its status grew to roughly megabyte scale. Reconcile attempts then failed with errors of the form:
etcdserver: request is too large
trying to send message larger than max (... vs. 2097152)
Clearing status.WorkloadStatus allowed reconcile to succeed again, but deleting completed pods alone did not remove the stale status entries, so the object would grow again with future pod churn.
Expected behavior
The controller should bound APW status growth for kind: Pod selectors, for example by pruning status entries that no longer correspond to currently matching live workloads, compacting status for Pod selectors, or documenting/guarding against using Pod selectors for high-churn pods.
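As one possible shape for the pruning option, the sketch below drops status entries whose pod is no longer in the set of currently matching live pods. It is an assumption-laden illustration only: `workloadStatus`, `prune`, and the `namespace/name` keyed `live` set are hypothetical and do not reflect the operator's real types or reconcile flow.

```go
package main

import "fmt"

// workloadStatus is a hypothetical stand-in for one entry in
// status.WorkloadStatus; the operator's real type may differ.
type workloadStatus struct {
	Kind, Version, Namespace, Name string
}

// prune returns only the entries whose "namespace/name" key is present in
// live, the set of workloads that currently exist and match the selector.
func prune(entries []workloadStatus, live map[string]bool) []workloadStatus {
	kept := make([]workloadStatus, 0, len(entries))
	for _, e := range entries {
		if live[e.Namespace+"/"+e.Name] {
			kept = append(kept, e)
		}
	}
	return kept
}

func main() {
	entries := []workloadStatus{
		{Kind: "Pod", Version: "v1", Namespace: "batch", Name: "job-1"},
		{Kind: "Pod", Version: "v1", Namespace: "batch", Name: "job-2"},
		{Kind: "Pod", Version: "v1", Namespace: "batch", Name: "job-3"},
	}
	// Only job-3 still exists; job-1 and job-2 have been deleted.
	live := map[string]bool{"batch/job-3": true}

	entries = prune(entries, live)
	fmt.Println(len(entries), entries[0].Name) // prints: 1 job-3
}
```

Running such a step on each reconcile would keep the status size proportional to the number of live matching pods instead of the number of pods ever seen.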
Notes
This is most visible for kind: Pod selectors because pod names are often unique and short-lived. The same status model is less likely to grow without bound for stable controller objects such as Deployments or StatefulSets.
I checked releases through v1.7.10 and did not find a change that appears to address this behavior.