Webhook cert renewal failing see history edit this page

Talks about: , , , and

Symptom

stageset_webhook_cert_renewal_failures_total is increasing; the StageSetWebhookCertRenewalFailing alert fires (see operations for the alert set and its thresholds). The current certificate keeps working until its natural expiry — that expiry is the deadline, after which cluster-wide StageSet admission breaks.

Cause

Only applies in --webhook-cert-mode=self-signed. The in-pod renewer regenerates the serving cert every validity/3 and patches the ValidatingWebhookConfiguration’s caBundle. It fails when:

In cert-manager mode this metric is irrelevant — cert-manager owns renewal.

Diagnosis

kubectl -n stageset-system logs deploy/stageset-controller | grep -i 'cert\|renew\|caBundle'
kubectl get validatingwebhookconfiguration <name> -o jsonpath='{.webhooks[*].clientConfig.caBundle}' | head -c 40

Remediation