<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>StageSet Controller</title><link>https://stageset.projects.metio.wtf/</link><description>Recent content on StageSet Controller</description><generator>Hugo</generator><language>en</language><atom:link href="https://stageset.projects.metio.wtf/index.xml" rel="self" type="application/rss+xml"/><item><title>Actions</title><link>https://stageset.projects.metio.wtf/usage/actions/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://stageset.projects.metio.wtf/usage/actions/</guid><description>&lt;p&gt;Actions are typed steps the controller runs around a stage&amp;rsquo;s apply. They turn an
ordered apply into an orchestrated rollout — run a migration before the app, gate
the stage on an external check, clean up on failure.&lt;/p&gt;
&lt;p&gt;A stage has three action hooks:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;pre&lt;/code&gt;&lt;/strong&gt; — run before the manifests are built and applied. A failure aborts the
stage with nothing applied.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;post&lt;/code&gt;&lt;/strong&gt; — run after the apply is verified. The stage is &lt;code&gt;Ready&lt;/code&gt; only if these
all succeed.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;onFailure&lt;/code&gt;&lt;/strong&gt; — best-effort steps run on any failure from the apply onward.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Each action has a &lt;code&gt;name&lt;/code&gt;, optional &lt;code&gt;timeout&lt;/code&gt; and &lt;code&gt;retries&lt;/code&gt;, and &lt;strong&gt;exactly one&lt;/strong&gt;
operation type (&lt;code&gt;patch&lt;/code&gt;, &lt;code&gt;http&lt;/code&gt;, &lt;code&gt;wait&lt;/code&gt;, &lt;code&gt;job&lt;/code&gt;, &lt;code&gt;delete&lt;/code&gt;, or &lt;code&gt;apply&lt;/code&gt;) — enforced
by the validating admission webhook. Actions within a hook run in list order.&lt;/p&gt;</description></item><item><title>ArtifactNotFound</title><link>https://stageset.projects.metio.wtf/runbooks/artifactnotfound/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://stageset.projects.metio.wtf/runbooks/artifactnotfound/</guid><description>&lt;h2 id="symptom"&gt;Symptom&lt;/h2&gt;
&lt;p&gt;&lt;code&gt;READY=False&lt;/code&gt;, &lt;code&gt;REASON=ArtifactNotFound&lt;/code&gt;. Transient: the controller requeues in case the artifact appears.&lt;/p&gt;
&lt;h2 id="cause"&gt;Cause&lt;/h2&gt;
&lt;p&gt;A stage&amp;rsquo;s &lt;code&gt;sourceRef&lt;/code&gt; resolves to &lt;strong&gt;no &lt;code&gt;ExternalArtifact&lt;/code&gt;&lt;/strong&gt;. Either:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;a &lt;strong&gt;direct&lt;/strong&gt; &lt;code&gt;sourceRef&lt;/code&gt; (&lt;code&gt;kind: ExternalArtifact&lt;/code&gt;, the default) names an object that does not exist in the target namespace; or&lt;/li&gt;
&lt;li&gt;a &lt;strong&gt;producer&lt;/strong&gt; &lt;code&gt;sourceRef&lt;/code&gt; (e.g. &lt;code&gt;kind: JsonnetSnippet&lt;/code&gt;) exists, but no &lt;code&gt;ExternalArtifact&lt;/code&gt; carries a &lt;code&gt;spec.sourceRef&lt;/code&gt; back-pointer to it yet — the producer has not created its artifact object.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="diagnosis"&gt;Diagnosis&lt;/h2&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-shell" data-lang="shell"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;kubectl describe stageset &amp;lt;name&amp;gt; -n &amp;lt;namespace&amp;gt; &lt;span class="c1"&gt;# Message names the missing ref&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;kubectl get externalartifact -n &amp;lt;namespace&amp;gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;For a producer ref, confirm the producer object exists and that it is configured to publish an &lt;code&gt;ExternalArtifact&lt;/code&gt; (not only serve over HTTP):&lt;/p&gt;</description></item><item><title>Building and testing</title><link>https://stageset.projects.metio.wtf/contributing/building/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://stageset.projects.metio.wtf/contributing/building/</guid><description>&lt;p&gt;The controller is a standard Go module. With a Go toolchain installed:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-shell" data-lang="shell"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;go build ./...
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;go &lt;span class="nb"&gt;test&lt;/span&gt; -race -cover ./...
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id="test-layers"&gt;Test layers&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Unit tests&lt;/strong&gt; sit next to the code across &lt;code&gt;internal/...&lt;/code&gt; and &lt;code&gt;api/v1/&lt;/code&gt;. Several
are drift gates — e.g. &lt;code&gt;conditions_test.go&lt;/code&gt; asserts every Ready &lt;code&gt;Reason&lt;/code&gt; has a
matching runbook page under &lt;code&gt;docs/content/runbooks/&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;envtest-backed tests&lt;/strong&gt; (&lt;code&gt;envtest_*_test.go&lt;/code&gt;) boot a real kube-apiserver + etcd
via controller-runtime&amp;rsquo;s &lt;code&gt;envtest&lt;/code&gt;. They &lt;code&gt;t.Skip&lt;/code&gt; unless &lt;code&gt;KUBEBUILDER_ASSETS&lt;/code&gt;
points at an asset bundle — install it with
&lt;a href="https://book.kubebuilder.io/reference/envtest.html"&gt;&lt;code&gt;setup-envtest&lt;/code&gt;&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Fuzz tests&lt;/strong&gt; (&lt;code&gt;FuzzXxx&lt;/code&gt;) harden the parsing-heavy paths; their seed corpus runs
as ordinary unit tests under plain &lt;code&gt;go test&lt;/code&gt;, and passing &lt;code&gt;-fuzz&lt;/code&gt; starts actual fuzzing.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Kind smoke&lt;/strong&gt; scenarios under &lt;code&gt;hack/smoke/&lt;/code&gt; run the controller end to end
against a real kind cluster.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="static-analysis"&gt;Static analysis&lt;/h2&gt;
&lt;p&gt;A pull request must be clean under each of these — run them locally before
pushing:&lt;/p&gt;</description></item><item><title>CI and releases</title><link>https://stageset.projects.metio.wtf/contributing/ci-and-release/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://stageset.projects.metio.wtf/contributing/ci-and-release/</guid><description>&lt;h2 id="continuous-integration"&gt;Continuous integration&lt;/h2&gt;
&lt;p&gt;Every pull request runs &lt;code&gt;verify.yml&lt;/code&gt;, which fans out into one job per concern so a
failure points straight at the cause:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;test&lt;/strong&gt; — &lt;code&gt;go build&lt;/code&gt; then the full &lt;code&gt;go test&lt;/code&gt; suite.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;lint-go&lt;/strong&gt; — &lt;code&gt;go vet&lt;/code&gt;, &lt;code&gt;staticcheck&lt;/code&gt;, &lt;code&gt;gosec&lt;/code&gt;, and a &lt;code&gt;gofumpt&lt;/code&gt; formatting check.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;vulnerabilities&lt;/strong&gt; — &lt;code&gt;govulncheck&lt;/code&gt; (a reachable advisory is a hard gate).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;architecture&lt;/strong&gt; — &lt;code&gt;arch-go&lt;/code&gt; against &lt;code&gt;arch-go.yml&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;reuse&lt;/strong&gt; — SPDX/REUSE compliance on every file.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;text linters&lt;/strong&gt; — &lt;code&gt;yamllint&lt;/code&gt;, &lt;code&gt;actionlint&lt;/code&gt;, &lt;code&gt;markdownlint&lt;/code&gt;, &lt;code&gt;typos&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;container-image&lt;/strong&gt; — a buildx image build plus a Trivy scan.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;A single &lt;strong&gt;all-green&lt;/strong&gt; job depends on every other job and is the only required
check, so new jobs are covered automatically. A separate &lt;code&gt;kind-smoke.yml&lt;/code&gt; runs the
operator end to end against a real kind cluster, and &lt;code&gt;fuzz.yml&lt;/code&gt; exercises the fuzz
targets.&lt;/p&gt;</description></item><item><title>Conflict policies</title><link>https://stageset.projects.metio.wtf/usage/conflict-policies/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://stageset.projects.metio.wtf/usage/conflict-policies/</guid><description>&lt;p&gt;Conflict policies decide what happens when an apply hits an immutable-field
conflict — a changed &lt;code&gt;clusterIP&lt;/code&gt;, a &lt;code&gt;Job&lt;/code&gt; pod template, a &lt;code&gt;StorageClass&lt;/code&gt; field
that can&amp;rsquo;t be updated in place. By default the controller fails the stage and
reports it, so nothing destructive happens by surprise. A policy opts specific
resources into automatic resolution.&lt;/p&gt;
&lt;h2 id="the-three-actions"&gt;The three actions&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;Fail&lt;/code&gt; — stop and report (the default; safest).&lt;/li&gt;
&lt;li&gt;&lt;code&gt;Recreate&lt;/code&gt; — delete and re-create the object to get past an immutable-field
change.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;KeepExisting&lt;/code&gt; — leave the live object as-is and move on.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="a-default-for-the-whole-stage"&gt;A default for the whole stage&lt;/h2&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-yaml" data-lang="yaml"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;spec&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;stages&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;- &lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;app&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;sourceRef&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;my-app&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;conflictPolicy&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;default&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;Fail &lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c"&gt;# explicit; the safe default&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The &lt;code&gt;force: true&lt;/code&gt; shorthand on a stage is equivalent to
&lt;code&gt;conflictPolicy.default: Recreate&lt;/code&gt;.&lt;/p&gt;</description></item><item><title>Controller pod down</title><link>https://stageset.projects.metio.wtf/runbooks/controller-pod-down/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://stageset.projects.metio.wtf/runbooks/controller-pod-down/</guid><description>&lt;h2 id="symptom"&gt;Symptom&lt;/h2&gt;
&lt;p&gt;A &lt;code&gt;stageset-controller&lt;/code&gt; pod is &lt;code&gt;NotReady&lt;/code&gt;; the &lt;code&gt;StageSetControllerPodDown&lt;/code&gt; alert
fires. While no replica is Ready, StageSets are not reconciled and the
&lt;a href="https://kubernetes.io/docs/"&gt;Kubernetes&lt;/a&gt; admission webhook may reject &lt;code&gt;StageSet&lt;/code&gt;
writes (&lt;code&gt;failurePolicy: Fail&lt;/code&gt;).&lt;/p&gt;
&lt;h2 id="cause"&gt;Cause&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;a crash-looping container (bad config flag, missing RBAC, panic),&lt;/li&gt;
&lt;li&gt;the node draining or out of resources,&lt;/li&gt;
&lt;li&gt;a failing readiness probe (&lt;code&gt;/readyz&lt;/code&gt; on &lt;code&gt;--health-probe-bind-address&lt;/code&gt;),&lt;/li&gt;
&lt;li&gt;the leader-election lease unobtainable.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="diagnosis"&gt;Diagnosis&lt;/h2&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-shell" data-lang="shell"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;kubectl -n stageset-system get pods -l app.kubernetes.io/name&lt;span class="o"&gt;=&lt;/span&gt;stageset-controller
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;kubectl -n stageset-system describe pod &amp;lt;pod&amp;gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;kubectl -n stageset-system logs &amp;lt;pod&amp;gt; --previous --tail&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="m"&gt;200&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Look for flag-parse errors at startup, RBAC &lt;code&gt;Forbidden&lt;/code&gt; on the controller&amp;rsquo;s own
&lt;code&gt;ServiceAccount&lt;/code&gt;, or OOMKills.&lt;/p&gt;</description></item><item><title>DependencyNotReady</title><link>https://stageset.projects.metio.wtf/runbooks/dependencynotready/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://stageset.projects.metio.wtf/runbooks/dependencynotready/</guid><description>&lt;h2 id="symptom"&gt;Symptom&lt;/h2&gt;
&lt;p&gt;&lt;code&gt;READY=False&lt;/code&gt;, &lt;code&gt;REASON=DependencyNotReady&lt;/code&gt;. Transient: the controller requeues at &lt;code&gt;spec.retryInterval&lt;/code&gt; (or &lt;code&gt;spec.interval&lt;/code&gt;).&lt;/p&gt;
&lt;h2 id="cause"&gt;Cause&lt;/h2&gt;
&lt;p&gt;A StageSet listed in &lt;code&gt;spec.dependsOn&lt;/code&gt; is not &lt;code&gt;Ready&lt;/code&gt; at its current generation, so this StageSet holds before doing any work. The semantics match kustomize-controller: a dependency is satisfied only when it reports &lt;code&gt;Ready=True&lt;/code&gt; &lt;strong&gt;and&lt;/strong&gt; its &lt;code&gt;status.observedGeneration&lt;/code&gt; equals its current generation, so a freshly edited dependency that is still mid-reconcile does not count as ready.&lt;/p&gt;
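&lt;p&gt;As a minimal sketch of that rule in Go, assuming a simplified, hypothetical &lt;code&gt;Dependency&lt;/code&gt; struct rather than the controller&amp;rsquo;s real status types:&lt;/p&gt;

```go
package main

import "fmt"

// Dependency is a hypothetical, flattened view of a StageSet's metadata and
// status; the field names are illustrative, not the controller's API.
type Dependency struct {
	Name               string
	Generation         int64 // metadata.generation
	ObservedGeneration int64 // status.observedGeneration
	Ready              bool  // Ready condition is True
}

// Satisfied reports whether a dependency counts as ready: the Ready condition
// must be true and must have been computed for the current generation.
func Satisfied(d Dependency) bool {
	if !d.Ready {
		return false
	}
	return d.ObservedGeneration == d.Generation
}

func main() {
	settled := Dependency{Name: "database", Generation: 3, ObservedGeneration: 3, Ready: true}
	// Edited after its last reconcile: the Ready condition is stale.
	edited := Dependency{Name: "database", Generation: 4, ObservedGeneration: 3, Ready: true}

	fmt.Println(settled.Name, "satisfied:", Satisfied(settled))
	fmt.Println(edited.Name, "satisfied:", Satisfied(edited))
}
```

&lt;p&gt;The second case is the one that surprises people: the dependency still shows &lt;code&gt;Ready=True&lt;/code&gt;, but that verdict belongs to the previous generation.&lt;/p&gt;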
&lt;h2 id="diagnosis"&gt;Diagnosis&lt;/h2&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-shell" data-lang="shell"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;kubectl describe stageset &amp;lt;name&amp;gt; -n &amp;lt;namespace&amp;gt; &lt;span class="c1"&gt;# Message names the dependency&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;kubectl get stageset &amp;lt;dependency&amp;gt; -n &amp;lt;namespace&amp;gt; &lt;span class="c1"&gt;# is it Ready?&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;kubectl describe stageset &amp;lt;dependency&amp;gt; -n &amp;lt;namespace&amp;gt; &lt;span class="c1"&gt;# why not?&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id="remediation"&gt;Remediation&lt;/h2&gt;
&lt;p&gt;Resolve the dependency&amp;rsquo;s own Ready condition first (follow its runbook). Once it reports &lt;code&gt;Ready=True&lt;/code&gt; at its current generation, this StageSet proceeds on the next reconcile. If the dependency is intentionally &lt;a href="https://stageset.projects.metio.wtf/runbooks/suspended/"&gt;suspended&lt;/a&gt;, this StageSet waits indefinitely by design — remove the &lt;code&gt;dependsOn&lt;/code&gt; entry or resume the dependency.&lt;/p&gt;</description></item><item><title>DowngradeRequiresMigration</title><link>https://stageset.projects.metio.wtf/runbooks/downgraderequiresmigration/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://stageset.projects.metio.wtf/runbooks/downgraderequiresmigration/</guid><description>&lt;h2 id="symptom"&gt;Symptom&lt;/h2&gt;
&lt;p&gt;&lt;code&gt;READY=False&lt;/code&gt;, &lt;code&gt;REASON=DowngradeRequiresMigration&lt;/code&gt;. Terminal: the run does not requeue until the desired version is at or above &lt;code&gt;status.version&lt;/code&gt;.&lt;/p&gt;
&lt;h2 id="cause"&gt;Cause&lt;/h2&gt;
&lt;p&gt;The desired version (&lt;code&gt;spec.version&lt;/code&gt;) is &lt;strong&gt;lower&lt;/strong&gt; than the version the controller last recorded as deployed (&lt;code&gt;status.version&lt;/code&gt;). Downgrades are refused by default: &lt;a href="https://stageset.projects.metio.wtf/usage/versioned-migrations/"&gt;migrations&lt;/a&gt; are forward-only action ladders, and replaying upgrade migrations in reverse is how data gets destroyed. The controller does not silently run a downgrade.&lt;/p&gt;
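&lt;p&gt;The check itself is an ordered comparison of two versions. The Go sketch below assumes both versions are already parsed into a hypothetical &lt;code&gt;Version&lt;/code&gt; triple and ignores pre-release and build metadata, which full semver precedence also takes into account; &lt;code&gt;IsDowngrade&lt;/code&gt; is an illustrative name, not the controller&amp;rsquo;s function.&lt;/p&gt;

```go
package main

import (
	"fmt"
	"slices"
)

// Version holds MAJOR, MINOR, PATCH. Hypothetical and simplified:
// pre-release and build metadata are deliberately left out.
type Version [3]int

// IsDowngrade reports whether the desired version sorts strictly below the
// deployed one. slices.Compare orders element by element, which matches
// semver precedence for plain MAJOR.MINOR.PATCH versions.
func IsDowngrade(desired, deployed Version) bool {
	return slices.Compare(desired[:], deployed[:]) == -1
}

func main() {
	deployed := Version{2, 0, 0} // what status.version would hold
	for _, desired := range []Version{{1, 4, 0}, {2, 0, 0}, {2, 1, 0}} {
		fmt.Println(desired, "is a downgrade from", deployed, ":", IsDowngrade(desired, deployed))
	}
}
```

&lt;p&gt;An equal version is not a downgrade, which matches the symptom text: the run resumes once the desired version is at or above &lt;code&gt;status.version&lt;/code&gt;.&lt;/p&gt;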
&lt;h2 id="diagnosis"&gt;Diagnosis&lt;/h2&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-shell" data-lang="shell"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;kubectl describe stageset &amp;lt;name&amp;gt; -n &amp;lt;namespace&amp;gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;kubectl get stageset &amp;lt;name&amp;gt; -n &amp;lt;namespace&amp;gt; -o &lt;span class="nv"&gt;jsonpath&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;{.status.version}&amp;#39;&lt;/span&gt; &lt;span class="c1"&gt;# deployed&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# desired: read spec.version.value, or the version file the artifact carries&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id="remediation"&gt;Remediation&lt;/h2&gt;
&lt;p&gt;Pick the intended direction:&lt;/p&gt;</description></item><item><title>From Jsonnet to a gated rollout</title><link>https://stageset.projects.metio.wtf/tutorials/jsonnet-to-rollout/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://stageset.projects.metio.wtf/tutorials/jsonnet-to-rollout/</guid><description>&lt;p&gt;This tutorial follows a complete delivery: write &lt;a href="https://kubernetes.io/docs/"&gt;Kubernetes&lt;/a&gt;
manifests in &lt;a href="https://jsonnet.org/"&gt;Jsonnet&lt;/a&gt; and publish the source through
&lt;a href="https://fluxcd.io/"&gt;Flux&lt;/a&gt;; &lt;a href="https://jaas.projects.metio.wtf/"&gt;JaaS&lt;/a&gt; renders it into a
Flux &lt;code&gt;ExternalArtifact&lt;/code&gt;, and a StageSet rolls it out with a readiness gate.&lt;/p&gt;
&lt;p&gt;The chain is:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-text" data-lang="text"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;Jsonnet in Git/OCI/Bucket → JaaS (JsonnetSnippet) → ExternalArtifact → StageSet
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;This tutorial renders &lt;em&gt;Jsonnet&lt;/em&gt;, so it goes through JaaS: JaaS turns the Jsonnet
into an &lt;code&gt;ExternalArtifact&lt;/code&gt; the stage consumes. (If your manifests were already plain
YAML, a stage could read a &lt;code&gt;GitRepository&lt;/code&gt;/&lt;code&gt;OCIRepository&lt;/code&gt;/&lt;code&gt;Bucket&lt;/code&gt; directly — see
&lt;a href="https://stageset.projects.metio.wtf/tutorials/flux-sources/"&gt;Stage sources&lt;/a&gt;. The renderer is here because the input is
Jsonnet, not because StageSet can&amp;rsquo;t read Git.)&lt;/p&gt;</description></item><item><title>Install on Kubernetes</title><link>https://stageset.projects.metio.wtf/installation/kubernetes/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://stageset.projects.metio.wtf/installation/kubernetes/</guid><description>&lt;h2 id="prerequisites"&gt;Prerequisites&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;A &lt;a href="https://kubernetes.io/docs/"&gt;Kubernetes&lt;/a&gt; cluster with &lt;code&gt;kubectl&lt;/code&gt; and
&lt;a href="https://helm.sh/"&gt;&lt;code&gt;helm&lt;/code&gt;&lt;/a&gt; configured against it.&lt;/li&gt;
&lt;li&gt;&lt;a href="https://fluxcd.io/"&gt;Flux&lt;/a&gt; &lt;code&gt;source-controller&lt;/code&gt;, specifically the
&lt;code&gt;ExternalArtifact&lt;/code&gt; API (&lt;code&gt;source.toolkit.fluxcd.io&lt;/code&gt;). A &lt;code&gt;StageSet&lt;/code&gt; stage always
resolves to an &lt;code&gt;ExternalArtifact&lt;/code&gt;, so the CRD must exist. &lt;code&gt;ExternalArtifact&lt;/code&gt;
was introduced in Flux &lt;strong&gt;v2.7.0&lt;/strong&gt;; install at least that version. The controller also
watches &lt;code&gt;GitRepository&lt;/code&gt;, &lt;code&gt;OCIRepository&lt;/code&gt;, and &lt;code&gt;Bucket&lt;/code&gt; sources for
producer-aware resolution.&lt;/li&gt;
&lt;li&gt;&lt;a href="https://cert-manager.io/"&gt;cert-manager&lt;/a&gt;, only if you choose the
&lt;code&gt;cert-manager&lt;/code&gt; webhook certificate mode. The chart defaults to &lt;code&gt;self-signed&lt;/code&gt;,
which provisions and rotates the admission webhook&amp;rsquo;s TLS in-process and needs
no cert-manager. See &lt;a href="https://stageset.projects.metio.wtf/installation/production/#admission-webhook-tls"&gt;production&lt;/a&gt;
for the trade-off.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;No particular artifact producer, such as &lt;a href="https://jaas.projects.metio.wtf/"&gt;JaaS&lt;/a&gt; or JOI,
is required to install the controller; those are sources of
&lt;code&gt;ExternalArtifact&lt;/code&gt;s, wired up per &lt;code&gt;StageSet&lt;/code&gt;.&lt;/p&gt;</description></item><item><title>InvalidSpec</title><link>https://stageset.projects.metio.wtf/runbooks/invalidspec/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://stageset.projects.metio.wtf/runbooks/invalidspec/</guid><description>&lt;h2 id="symptom"&gt;Symptom&lt;/h2&gt;
&lt;p&gt;&lt;code&gt;READY=False&lt;/code&gt;, &lt;code&gt;REASON=InvalidSpec&lt;/code&gt;. The Message names the offending field or action. Terminal: the controller does not requeue until the spec changes.&lt;/p&gt;
&lt;h2 id="cause"&gt;Cause&lt;/h2&gt;
&lt;p&gt;The spec failed validation that the CRD schema cannot express cheaply, normally one of:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;an &lt;strong&gt;action sets zero or more than one verb&lt;/strong&gt; — each action must set exactly one of &lt;code&gt;patch&lt;/code&gt;, &lt;code&gt;http&lt;/code&gt;, &lt;code&gt;wait&lt;/code&gt;, &lt;code&gt;job&lt;/code&gt;, &lt;code&gt;delete&lt;/code&gt;, &lt;code&gt;apply&lt;/code&gt; (see &lt;a href="https://stageset.projects.metio.wtf/usage/actions/"&gt;actions&lt;/a&gt;);&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;spec.migrations&lt;/code&gt; without &lt;code&gt;spec.version&lt;/code&gt;&lt;/strong&gt;, or a migration anchored to a stage name that does not exist (see &lt;a href="https://stageset.projects.metio.wtf/usage/versioned-migrations/"&gt;versioned migrations&lt;/a&gt;);&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;spec.version&lt;/code&gt; does not name exactly one source&lt;/strong&gt; — set one of &lt;code&gt;value&lt;/code&gt;, &lt;code&gt;fromObject&lt;/code&gt;, or &lt;code&gt;fromArtifact&lt;/code&gt;;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;spec.decryption.provider&lt;/code&gt; is not &lt;code&gt;sops&lt;/code&gt;&lt;/strong&gt;, or a &lt;code&gt;secretRef&lt;/code&gt; is given without a &lt;code&gt;name&lt;/code&gt; (see &lt;a href="https://stageset.projects.metio.wtf/usage/encryption/"&gt;encryption&lt;/a&gt;);&lt;/li&gt;
&lt;li&gt;an &lt;strong&gt;invalid update window&lt;/strong&gt; — a malformed &lt;code&gt;schedule&lt;/code&gt;, &lt;code&gt;duration&lt;/code&gt;, or &lt;code&gt;timeZone&lt;/code&gt; (see &lt;a href="https://stageset.projects.metio.wtf/usage/update-windows/"&gt;update windows&lt;/a&gt;).&lt;/li&gt;
&lt;/ul&gt;
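&lt;p&gt;The first rule in the list, exactly one operation per action, is the kind of check a schema expresses poorly and a few lines of code express directly. A hedged Go sketch follows; the &lt;code&gt;Action&lt;/code&gt; shape and &lt;code&gt;Validate&lt;/code&gt; function are hypothetical and only the six operation names come from the documentation.&lt;/p&gt;

```go
package main

import (
	"fmt"
	"strings"
)

// operations lists the operation types an action may set.
var operations = []string{"patch", "http", "wait", "job", "delete", "apply"}

// Action is a hypothetical stand-in for an action entry: Set records which
// operation keys are present on it.
type Action struct {
	Name string
	Set  map[string]bool
}

// Validate returns an error unless the action sets exactly one operation.
func Validate(a Action) error {
	var found []string
	for _, op := range operations {
		if a.Set[op] {
			found = append(found, op)
		}
	}
	switch len(found) {
	case 1:
		return nil
	case 0:
		return fmt.Errorf("action %q sets no operation; want exactly one of %s",
			a.Name, strings.Join(operations, ", "))
	default:
		return fmt.Errorf("action %q sets %d operations (%s); want exactly one",
			a.Name, len(found), strings.Join(found, ", "))
	}
}

func main() {
	fmt.Println(Validate(Action{Name: "migrate", Set: map[string]bool{"job": true}}))
	fmt.Println(Validate(Action{Name: "gate", Set: map[string]bool{"http": true, "wait": true}}))
	fmt.Println(Validate(Action{Name: "empty"}))
}
```

&lt;p&gt;Naming the offending action in the error is what lets the condition Message point at the exact entry to fix.&lt;/p&gt;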
&lt;p&gt;The admission webhook normally rejects these at write time; seeing this on the object means the webhook was bypassed or disabled and the reconciler caught it.&lt;/p&gt;</description></item><item><title>InvalidVersion</title><link>https://stageset.projects.metio.wtf/runbooks/invalidversion/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://stageset.projects.metio.wtf/runbooks/invalidversion/</guid><description>&lt;h2 id="symptom"&gt;Symptom&lt;/h2&gt;
&lt;p&gt;&lt;code&gt;READY=False&lt;/code&gt;, &lt;code&gt;REASON=InvalidVersion&lt;/code&gt;. Terminal: the run does not requeue until the spec or the version file is fixed.&lt;/p&gt;
&lt;h2 id="cause"&gt;Cause&lt;/h2&gt;
&lt;p&gt;The version from &lt;code&gt;spec.version&lt;/code&gt; (or a migration boundary) could not be resolved to a parseable &lt;a href="https://semver.org/"&gt;semver&lt;/a&gt;. The controller refuses to proceed rather than deploy a half-versioned system: a system whose recorded version is unknown is worse for migrations than an unversioned one. The Message names which input failed. By version source:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;spec.version.value&lt;/code&gt;&lt;/strong&gt; — the inline string is not a semver.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;spec.version.fromObject&lt;/code&gt;&lt;/strong&gt; — the named stage doesn&amp;rsquo;t exist; the object (&lt;code&gt;kind&lt;/code&gt;/&lt;code&gt;name&lt;/code&gt;) isn&amp;rsquo;t in the stage&amp;rsquo;s rendered manifests; the &lt;code&gt;fieldPath&lt;/code&gt; is invalid JSONPath or resolves to empty; or the value read (by default the &lt;code&gt;app.kubernetes.io/version&lt;/code&gt; label) is missing or not a semver.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;spec.version.fromArtifact&lt;/code&gt;&lt;/strong&gt; — the named stage doesn&amp;rsquo;t exist; the file at &lt;code&gt;path&lt;/code&gt; is missing from the stage&amp;rsquo;s artifact, empty, or doesn&amp;rsquo;t parse as a semver.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;spec.version&lt;/code&gt; sets none&lt;/strong&gt; of &lt;code&gt;value&lt;/code&gt;/&lt;code&gt;fromObject&lt;/code&gt;/&lt;code&gt;fromArtifact&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;A &lt;strong&gt;migration&amp;rsquo;s &lt;code&gt;to&lt;/code&gt; or &lt;code&gt;from&lt;/code&gt;&lt;/strong&gt; is not a valid semver.&lt;/li&gt;
&lt;li&gt;The recorded &lt;strong&gt;&lt;code&gt;status.version&lt;/code&gt;&lt;/strong&gt; is not a semver (corrupted status).&lt;/li&gt;
&lt;/ul&gt;
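&lt;p&gt;To see what &amp;ldquo;not a semver&amp;rdquo; means in practice, here is a deliberately strict Go sketch. It is not the controller&amp;rsquo;s parser: &lt;code&gt;ParseCore&lt;/code&gt; is a hypothetical helper that accepts only a bare &lt;code&gt;MAJOR.MINOR.PATCH&lt;/code&gt; core and leaves out the pre-release and build metadata that full semver allows.&lt;/p&gt;

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// ParseCore parses a bare MAJOR.MINOR.PATCH string. Hypothetical helper for
// illustration; pre-release and build metadata are out of scope.
func ParseCore(s string) ([3]int, error) {
	parts := strings.Split(s, ".")
	if len(parts) != 3 {
		return [3]int{}, fmt.Errorf("%q: want MAJOR.MINOR.PATCH", s)
	}
	var v [3]int
	for i, p := range parts {
		// Digits only: this rejects a "v" prefix, whitespace, and signs.
		if p == "" || strings.Trim(p, "0123456789") != "" {
			return [3]int{}, fmt.Errorf("%q: component %q is not numeric", s, p)
		}
		n, err := strconv.Atoi(p)
		if err != nil {
			return [3]int{}, fmt.Errorf("%q: %w", s, err)
		}
		// Semver forbids leading zeros; a round trip exposes them.
		if strconv.Itoa(n) != p {
			return [3]int{}, fmt.Errorf("%q: component %q has a leading zero", s, p)
		}
		v[i] = n
	}
	return v, nil
}

func main() {
	for _, in := range []string{"1.2.3", "v1.2.3", "1.2.3 ", "latest", "3f2a9c1", "1.02.3"} {
		v, err := ParseCore(in)
		fmt.Printf("%q: version=%v err=%v\n", in, v, err)
	}
}
```

&lt;p&gt;Only the first input parses; every other one fails for a reason the error message can name, which is the behaviour the runbook relies on when it says the Message identifies the failing input.&lt;/p&gt;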
&lt;p&gt;Common triggers across all of them: a &lt;code&gt;v&lt;/code&gt; prefix or trailing whitespace the parser rejects, or non-semver text (e.g. a Git SHA or a &lt;code&gt;latest&lt;/code&gt; tag) where a version was expected.&lt;/p&gt;</description></item><item><title>Multi-cluster and tenancy</title><link>https://stageset.projects.metio.wtf/usage/multi-cluster/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://stageset.projects.metio.wtf/usage/multi-cluster/</guid><description>&lt;p&gt;There are two ways to run the controller, and they map onto two different trust
models. Pick the one that matches your cluster:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Multi-tenant&lt;/strong&gt; — the controller holds no write access of its own and applies
every &lt;code&gt;StageSet&lt;/code&gt; impersonating that &lt;code&gt;StageSet&lt;/code&gt;&amp;rsquo;s &lt;code&gt;serviceAccountName&lt;/code&gt;. Each
tenant&amp;rsquo;s RBAC bounds what its releases can touch. This is the chart default.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Single-tenant&lt;/strong&gt; — the cluster has one operator, so per-tenant isolation buys
nothing. Run the controller under its own identity bound to &lt;code&gt;cluster-admin&lt;/code&gt; and
skip impersonation entirely — the model Flux&amp;rsquo;s &lt;code&gt;helm-controller&lt;/code&gt; uses in its
default install.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The two sections below set each one up. The optional
&lt;a href="#scoping-the-controller-to-a-namespace-set"&gt;watch scoping&lt;/a&gt; narrows &lt;em&gt;which&lt;/em&gt;
namespaces a multi-tenant controller sees.&lt;/p&gt;</description></item><item><title>Operations</title><link>https://stageset.projects.metio.wtf/installation/operations/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://stageset.projects.metio.wtf/installation/operations/</guid><description>&lt;h2 id="metrics"&gt;Metrics&lt;/h2&gt;
&lt;p&gt;The controller registers custom metrics on the controller-runtime registry, served
on &lt;code&gt;--metrics-bind-address&lt;/code&gt; (&lt;code&gt;:8080&lt;/code&gt;) alongside the standard
&lt;code&gt;controller_runtime_*&lt;/code&gt; and &lt;code&gt;workqueue_*&lt;/code&gt; series. Enable scraping with the chart&amp;rsquo;s
opt-in &lt;code&gt;ServiceMonitor&lt;/code&gt; (&lt;code&gt;metrics.serviceMonitor.enabled&lt;/code&gt;):&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-yaml" data-lang="yaml"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c"&gt;# values.yaml&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;metrics&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;serviceMonitor&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;enabled&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c"&gt;# needs the Prometheus operator CRDs&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;table&gt;
	&lt;thead&gt;
			&lt;tr&gt;
					&lt;th&gt;Metric&lt;/th&gt;
					&lt;th&gt;Type&lt;/th&gt;
					&lt;th&gt;Labels&lt;/th&gt;
					&lt;th&gt;Meaning&lt;/th&gt;
			&lt;/tr&gt;
	&lt;/thead&gt;
	&lt;tbody&gt;
			&lt;tr&gt;
					&lt;td&gt;&lt;code&gt;stageset_reconcile_total&lt;/code&gt;&lt;/td&gt;
					&lt;td&gt;counter&lt;/td&gt;
					&lt;td&gt;&lt;code&gt;namespace&lt;/code&gt;, &lt;code&gt;name&lt;/code&gt;, &lt;code&gt;reason&lt;/code&gt;&lt;/td&gt;
					&lt;td&gt;Reconciles, by terminal Ready reason.&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;&lt;code&gt;stageset_stage_applied_total&lt;/code&gt;&lt;/td&gt;
					&lt;td&gt;counter&lt;/td&gt;
					&lt;td&gt;&lt;code&gt;namespace&lt;/code&gt;, &lt;code&gt;name&lt;/code&gt;, &lt;code&gt;stage&lt;/code&gt;&lt;/td&gt;
					&lt;td&gt;Stages applied and verified.&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;&lt;code&gt;stageset_drift_corrected_total&lt;/code&gt;&lt;/td&gt;
					&lt;td&gt;counter&lt;/td&gt;
					&lt;td&gt;&lt;code&gt;namespace&lt;/code&gt;, &lt;code&gt;name&lt;/code&gt;, &lt;code&gt;stage&lt;/code&gt;&lt;/td&gt;
					&lt;td&gt;Out-of-band drift re-asserted on a steady-state reconcile.&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;&lt;code&gt;stageset_update_deferred_total&lt;/code&gt;&lt;/td&gt;
					&lt;td&gt;counter&lt;/td&gt;
					&lt;td&gt;&lt;code&gt;namespace&lt;/code&gt;, &lt;code&gt;name&lt;/code&gt;&lt;/td&gt;
					&lt;td&gt;Rollouts held by a closed update window.&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;&lt;code&gt;stageset_webhook_cert_renewal_failures_total&lt;/code&gt;&lt;/td&gt;
					&lt;td&gt;counter&lt;/td&gt;
					&lt;td&gt;&lt;em&gt;(none)&lt;/em&gt;&lt;/td&gt;
					&lt;td&gt;Failed self-signed webhook cert renewals.&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;&lt;code&gt;stageset_stage_ready&lt;/code&gt;&lt;/td&gt;
					&lt;td&gt;gauge&lt;/td&gt;
					&lt;td&gt;&lt;code&gt;namespace&lt;/code&gt;, &lt;code&gt;stageset&lt;/code&gt;, &lt;code&gt;stage&lt;/code&gt;&lt;/td&gt;
					&lt;td&gt;&lt;code&gt;1&lt;/code&gt; when a stage is Ready, else &lt;code&gt;0&lt;/code&gt; — for metric-based &lt;a href="https://stageset.projects.metio.wtf/tutorials/progressive-delivery/#argo-rollouts"&gt;progressive delivery&lt;/a&gt;.&lt;/td&gt;
			&lt;/tr&gt;
	&lt;/tbody&gt;
&lt;/table&gt;
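&lt;p&gt;As a rough sketch of how the table translates into queries, the expressions below use only the metric and label names listed above. The &lt;code&gt;payments&lt;/code&gt; and &lt;code&gt;app&lt;/code&gt; label values and the time windows are illustrative assumptions, not defaults:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-text" data-lang="text"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;# 1 while the (hypothetical) app stage of the payments StageSet is Ready, else 0
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;stageset_stage_ready{namespace=&amp;quot;payments&amp;quot;, stageset=&amp;quot;payments&amp;quot;, stage=&amp;quot;app&amp;quot;}
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;# Reconciles over the last 15 minutes, split by terminal Ready reason
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;sum by (reason) (increase(stageset_reconcile_total{namespace=&amp;quot;payments&amp;quot;, name=&amp;quot;payments&amp;quot;}[15m]))
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;# Stages whose out-of-band drift was corrected in the last hour
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;increase(stageset_drift_corrected_total[1h]) &amp;gt; 0
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;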
&lt;h2 id="alerts"&gt;Alerts&lt;/h2&gt;
&lt;p&gt;The chart ships an opt-in &lt;code&gt;PrometheusRule&lt;/code&gt; with a starter alert set, gated on
&lt;code&gt;metrics.prometheusRule.enabled&lt;/code&gt; (requires the
&lt;a href="https://prometheus-operator.dev/"&gt;Prometheus operator&lt;/a&gt; CRDs). It covers the
custom &lt;code&gt;stageset_*&lt;/code&gt; metrics plus controller-runtime signals:&lt;/p&gt;</description></item><item><title>Parameterizing a rollout</title><link>https://stageset.projects.metio.wtf/tutorials/parameters/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://stageset.projects.metio.wtf/tutorials/parameters/</guid><description>&lt;p&gt;A rollout takes parameters at two distinct layers, which serve different purposes:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Render-time parameters (JaaS).&lt;/strong&gt; Change &lt;em&gt;what gets rendered&lt;/em&gt;. The Jsonnet
computes its output from top-level arguments (&lt;code&gt;tlas&lt;/code&gt;) and external variables
(&lt;code&gt;externalVariables&lt;/code&gt;). Different values produce a different &lt;code&gt;ExternalArtifact&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Delivery-time parameters (StageSet &lt;code&gt;postBuild&lt;/code&gt;).&lt;/strong&gt; Inject values &lt;em&gt;into
already-rendered manifests&lt;/em&gt;, per stage, by string substitution — the same
mechanism Flux&amp;rsquo;s &lt;code&gt;kustomize-controller&lt;/code&gt; uses.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Use render-time parameters for structural logic; use delivery-time parameters to
stamp environment-specific values onto a shared artifact.&lt;/p&gt;</description></item><item><title>PreviousRevisionUnavailable</title><link>https://stageset.projects.metio.wtf/runbooks/previousrevisionunavailable/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://stageset.projects.metio.wtf/runbooks/previousrevisionunavailable/</guid><description>&lt;h2 id="symptom"&gt;Symptom&lt;/h2&gt;
&lt;p&gt;&lt;code&gt;READY=False&lt;/code&gt;, &lt;code&gt;REASON=PreviousRevisionUnavailable&lt;/code&gt;. The StageSet has &lt;code&gt;spec.rollbackOnFailure&lt;/code&gt; set, a run failed, and the controller could not restore the last-good revisions.&lt;/p&gt;
&lt;h2 id="cause"&gt;Cause&lt;/h2&gt;
&lt;p&gt;&lt;a href="https://stageset.projects.metio.wtf/usage/rollback/"&gt;&lt;code&gt;rollbackOnFailure&lt;/code&gt;&lt;/a&gt; restores the previously-applied artifact revisions by re-fetching their recorded URLs and verifying their digests. That only works while the &lt;strong&gt;producer still retains&lt;/strong&gt; those revisions. This reason means a revision the rollback needs is no longer fetchable — the producer garbage-collected it.&lt;/p&gt;
&lt;p&gt;Rollback is best-effort by contract: it works exactly when producers retain. Common triggers:&lt;/p&gt;</description></item><item><title>Producer-aware sources</title><link>https://stageset.projects.metio.wtf/usage/producer-aware-sources/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://stageset.projects.metio.wtf/usage/producer-aware-sources/</guid><description>&lt;p&gt;&lt;a href="https://stageset.projects.metio.wtf/usage/stages-and-sources/#source-kinds"&gt;Stages and sources&lt;/a&gt; covers the two
direct routes — an &lt;code&gt;ExternalArtifact&lt;/code&gt; (the default &lt;code&gt;sourceRef.kind&lt;/code&gt;) or a Flux
&lt;code&gt;GitRepository&lt;/code&gt;/&lt;code&gt;OCIRepository&lt;/code&gt;/&lt;code&gt;Bucket&lt;/code&gt;. This page covers the third: naming the
thing that &lt;em&gt;produces&lt;/em&gt; an artifact and letting the controller find it. This is useful
when an operator publishes an &lt;code&gt;ExternalArtifact&lt;/code&gt; from a custom resource (for example
&lt;a href="https://jaas.projects.metio.wtf/"&gt;JaaS&lt;/a&gt; rendering Jsonnet).&lt;/p&gt;
&lt;h2 id="referencing-a-producer"&gt;Referencing a producer&lt;/h2&gt;
&lt;p&gt;Set &lt;code&gt;kind&lt;/code&gt; (and &lt;code&gt;apiVersion&lt;/code&gt;) to a producer resource, and the controller resolves
it to the &lt;code&gt;ExternalArtifact&lt;/code&gt; that producer publishes — the one whose
&lt;code&gt;spec.sourceRef&lt;/code&gt; back-references the producer (matched on group, kind, and name).
For example, a &lt;a href="https://jaas.projects.metio.wtf/"&gt;JaaS&lt;/a&gt; &lt;code&gt;JsonnetSnippet&lt;/code&gt;
renders Jsonnet and publishes an &lt;code&gt;ExternalArtifact&lt;/code&gt;; reference the snippet and the
controller follows the link:&lt;/p&gt;</description></item><item><title>Production</title><link>https://stageset.projects.metio.wtf/installation/production/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://stageset.projects.metio.wtf/installation/production/</guid><description>&lt;h2 id="high-availability"&gt;High availability&lt;/h2&gt;
&lt;p&gt;The controller supports leader-elected HA. Enable leader election and run more
than one replica; only the lease holder reconciles, while every replica answers
admission webhook calls (admission must stay available even on non-leaders).&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Leader election is toggled with &lt;code&gt;--leader-elect&lt;/code&gt;. The binary defaults it to
&lt;code&gt;false&lt;/code&gt;, but the &lt;strong&gt;Helm chart enables it by default&lt;/strong&gt; (&lt;code&gt;controller.leaderElect: true&lt;/code&gt;), so a default install is already lease-guarded even at one replica.&lt;/li&gt;
&lt;li&gt;The lease is named &lt;code&gt;stageset-controller.stages.metio.wtf&lt;/code&gt; and lives in the
controller&amp;rsquo;s namespace. It uses controller-runtime&amp;rsquo;s default timing (~15 s
lease duration). The lease is &lt;strong&gt;not&lt;/strong&gt; released eagerly on shutdown, so after a
rolling update the new leader takes over only once the old lease expires. Budget a
reconcile pause of up to roughly one lease duration on restart (admission and the
gate endpoint are unaffected).&lt;/li&gt;
&lt;li&gt;Scaling: when the chart&amp;rsquo;s &lt;code&gt;replicas.max&lt;/code&gt; exceeds &lt;code&gt;replicas.min&lt;/code&gt; it renders a
&lt;code&gt;HorizontalPodAutoscaler&lt;/code&gt; (CPU target 80%) and a &lt;code&gt;PodDisruptionBudget&lt;/code&gt;
(&lt;code&gt;minAvailable: 1&lt;/code&gt;). At the default 1/1 it sets neither and leaves
&lt;code&gt;spec.replicas&lt;/code&gt; unmanaged.&lt;/li&gt;
&lt;/ul&gt;
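&lt;p&gt;A minimal values sketch for an HA install, using only the chart keys named above. The replica counts are illustrative assumptions, and the exact nesting should be checked against the chart&amp;rsquo;s own &lt;code&gt;values.yaml&lt;/code&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-yaml" data-lang="yaml"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;controller:
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;  leaderElect: true  # already the chart default; shown for clarity
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;replicas:
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;  min: 2  # assumed value
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;  max: 4  # assumed value; max above min renders the HPA and the PDB
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;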
&lt;p&gt;The controller watches every namespace by default. Multi-tenancy is enforced per
&lt;code&gt;StageSet&lt;/code&gt; through impersonation (see below). You can additionally scope the
controller to a namespace set with &lt;code&gt;controller.watchNamespaces&lt;/code&gt; — one controller
instance per tenant-group — and run it under &lt;code&gt;cluster-admin&lt;/code&gt; for single-tenant
clusters; both are covered in
&lt;a href="https://stageset.projects.metio.wtf/usage/multi-cluster/"&gt;multi-cluster and tenancy&lt;/a&gt;.&lt;/p&gt;</description></item><item><title>Progressive delivery</title><link>https://stageset.projects.metio.wtf/tutorials/progressive-delivery/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://stageset.projects.metio.wtf/tutorials/progressive-delivery/</guid><description>&lt;p&gt;&lt;code&gt;StageSet&lt;/code&gt; integrates with both progressive-delivery controllers:
&lt;a href="https://flagger.app/"&gt;Flagger&lt;/a&gt; and
&lt;a href="https://argoproj.github.io/argo-rollouts/"&gt;Argo Rollouts&lt;/a&gt;. The controller exposes
a read-only gate endpoint and a readiness gauge so either one can hold a promotion
until a &lt;code&gt;StageSet&lt;/code&gt; stage is healthy; ready checks let a stage wait on a Rollout in
return. Pick the section for your controller below — see also
&lt;a href="https://stageset.projects.metio.wtf/comparisons/argo-rollouts/"&gt;StageSet vs Argo Rollouts&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id="the-gate-contract"&gt;The gate contract&lt;/h2&gt;
&lt;p&gt;The gate endpoint backs the Flagger integration and the Argo Rollouts JSON-metric
option.&lt;/p&gt;</description></item><item><title>Ready checks</title><link>https://stageset.projects.metio.wtf/usage/ready-checks/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://stageset.projects.metio.wtf/usage/ready-checks/</guid><description>&lt;p&gt;Ready checks decide when a stage is healthy enough to let the next stage start.
They are purely observational — the controller waits and reports, but takes no
action (active steps are &lt;a href="https://stageset.projects.metio.wtf/usage/actions/"&gt;actions&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;By default, with no &lt;code&gt;readyChecks&lt;/code&gt; block, the controller waits for &lt;strong&gt;every&lt;/strong&gt; object
the stage applied to report ready via
&lt;a href="https://github.com/kubernetes-sigs/cli-utils/tree/master/pkg/kstatus"&gt;kstatus&lt;/a&gt;.
&lt;code&gt;readyChecks&lt;/code&gt; lets you narrow that to specific objects (&lt;code&gt;checks&lt;/code&gt;), add custom
health for resources kstatus doesn&amp;rsquo;t understand (&lt;code&gt;exprs&lt;/code&gt;, &lt;a href="https://github.com/google/cel-spec"&gt;CEL&lt;/a&gt;),
bound the wait (&lt;code&gt;timeout&lt;/code&gt;), or skip it entirely (&lt;code&gt;disableWait&lt;/code&gt;). &lt;code&gt;checks&lt;/code&gt; and
&lt;code&gt;exprs&lt;/code&gt; may be set together.&lt;/p&gt;</description></item><item><title>Reconcile latency high</title><link>https://stageset.projects.metio.wtf/runbooks/reconcile-latency/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://stageset.projects.metio.wtf/runbooks/reconcile-latency/</guid><description>&lt;h2 id="symptom"&gt;Symptom&lt;/h2&gt;
&lt;p&gt;&lt;code&gt;controller_runtime_reconcile_time_seconds&lt;/code&gt; p99 for &lt;code&gt;controller=&amp;quot;stageset&amp;quot;&lt;/code&gt; exceeds
the configured threshold; the &lt;code&gt;StageSetReconcileLatencyHigh&lt;/code&gt; alert fires (see
&lt;a href="https://stageset.projects.metio.wtf/installation/operations/"&gt;operations&lt;/a&gt; for the alert set and its thresholds).&lt;/p&gt;
&lt;h2 id="cause"&gt;Cause&lt;/h2&gt;
&lt;p&gt;A single reconcile does a lot of work — resolve and fetch every stage&amp;rsquo;s artifact,
kustomize-build, server-side apply, prune, verify readiness, and run actions — all
impersonating the tenant &lt;code&gt;ServiceAccount&lt;/code&gt;. Latency climbs when any of those is slow:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;large artifacts or slow artifact servers,&lt;/li&gt;
&lt;li&gt;many objects per stage (apply + prune scale with object count),&lt;/li&gt;
&lt;li&gt;readiness waits and &lt;code&gt;wait&lt;/code&gt;/&lt;code&gt;http&lt;/code&gt;/&lt;code&gt;job&lt;/code&gt; actions with long timeouts,&lt;/li&gt;
&lt;li&gt;apiserver or tenant-authorization slowness.&lt;/li&gt;
&lt;/ul&gt;
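&lt;p&gt;The p99 named in the symptom can be reproduced ad hoc with a query along these lines. This is a sketch: the &lt;code&gt;5m&lt;/code&gt; rate window is an assumption and may differ from what the shipped alert uses.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-text" data-lang="text"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;# p99 reconcile duration for the stageset controller, from the histogram buckets
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;histogram_quantile(
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;  0.99,
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;  sum by (le) (
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;    rate(controller_runtime_reconcile_time_seconds_bucket{controller=&amp;quot;stageset&amp;quot;}[5m])
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;  )
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;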
&lt;h2 id="diagnosis"&gt;Diagnosis&lt;/h2&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-shell" data-lang="shell"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;kubectl -n stageset-system logs deploy/stageset-controller --tail&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="m"&gt;200&lt;/span&gt; &lt;span class="p"&gt;|&lt;/span&gt; grep -i &lt;span class="s1"&gt;&amp;#39;slow\|timeout\|took&amp;#39;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Break the latency down by stage count and artifact size; a single StageSet with
many large stages can dominate p99.&lt;/p&gt;</description></item><item><title>ResolveFailed</title><link>https://stageset.projects.metio.wtf/runbooks/resolvefailed/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://stageset.projects.metio.wtf/runbooks/resolvefailed/</guid><description>&lt;h2 id="symptom"&gt;Symptom&lt;/h2&gt;
&lt;p&gt;&lt;code&gt;READY=False&lt;/code&gt;, &lt;code&gt;REASON=ResolveFailed&lt;/code&gt;. The Message describes why resolution failed.&lt;/p&gt;
&lt;h2 id="cause"&gt;Cause&lt;/h2&gt;
&lt;p&gt;A stage&amp;rsquo;s &lt;code&gt;sourceRef&lt;/code&gt; could not be resolved to an &lt;code&gt;ExternalArtifact&lt;/code&gt; for a spec/config or API reason (distinct from &amp;ldquo;not published yet&amp;rdquo;, which is &lt;a href="https://stageset.projects.metio.wtf/runbooks/sourcenotready/"&gt;&lt;code&gt;SourceNotReady&lt;/code&gt;&lt;/a&gt;, and &amp;ldquo;no such object&amp;rdquo;, which is &lt;a href="https://stageset.projects.metio.wtf/runbooks/artifactnotfound/"&gt;&lt;code&gt;ArtifactNotFound&lt;/code&gt;&lt;/a&gt;). Common cases:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;an &lt;strong&gt;ambiguous producer&lt;/strong&gt; — more than one &lt;code&gt;ExternalArtifact&lt;/code&gt; back-points at the same producer object, so the target is undefined;&lt;/li&gt;
&lt;li&gt;a &lt;strong&gt;cross-namespace ref rejected&lt;/strong&gt; by &lt;code&gt;--no-cross-namespace-refs&lt;/code&gt;;&lt;/li&gt;
&lt;li&gt;an &lt;strong&gt;API error&lt;/strong&gt; reading the source or artifact (RBAC denial, the artifact CRD not installed).&lt;/li&gt;
&lt;/ul&gt;
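&lt;p&gt;For the ambiguous-producer case, one way to spot duplicates is to list every &lt;code&gt;ExternalArtifact&lt;/code&gt; in the namespace next to the producer it points back at. This is a sketch that assumes the back-reference lives under &lt;code&gt;spec.sourceRef&lt;/code&gt; with &lt;code&gt;kind&lt;/code&gt; and &lt;code&gt;name&lt;/code&gt; fields; the column names are arbitrary:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-shell" data-lang="shell"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;# Two rows with the same PRODUCER_KIND and PRODUCER_NAME indicate an ambiguous producer
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;kubectl get externalartifact -n &amp;lt;namespace&amp;gt; \
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;  -o custom-columns=&amp;#39;NAME:.metadata.name,PRODUCER_KIND:.spec.sourceRef.kind,PRODUCER_NAME:.spec.sourceRef.name&amp;#39;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;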
&lt;p&gt;When the failing &lt;code&gt;sourceRef&lt;/code&gt; targets another namespace, the Message is deliberately scrubbed to &lt;code&gt;cross-namespace &amp;lt;kind&amp;gt; %q is not reachable&lt;/code&gt; so tenants cannot fingerprint other namespaces — check that source CR&amp;rsquo;s status in its own namespace.&lt;/p&gt;</description></item><item><title>Rollback</title><link>https://stageset.projects.metio.wtf/usage/rollback/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://stageset.projects.metio.wtf/usage/rollback/</guid><description>&lt;p&gt;When a run fails, the controller can restore the last successfully-applied artifact
revisions instead of leaving you on a broken release. Rollback is opt-in and works
only while each artifact producer still retains the prior revisions.&lt;/p&gt;
&lt;h2 id="enabling-it"&gt;Enabling it&lt;/h2&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-yaml" data-lang="yaml"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;spec&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;rollbackOnFailure&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;stages&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;- &lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;app&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;sourceRef&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;my-app&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;On a failed run the controller restores each stage&amp;rsquo;s last-good artifact revision,
best-effort, and emits a &lt;code&gt;RolledBack&lt;/code&gt; event. The coordinates it restores from are
recorded in &lt;code&gt;status.lastAppliedSnapshot&lt;/code&gt;.&lt;/p&gt;</description></item><item><title>Secrets encryption (SOPS)</title><link>https://stageset.projects.metio.wtf/usage/encryption/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://stageset.projects.metio.wtf/usage/encryption/</guid><description>&lt;p&gt;A stage&amp;rsquo;s source can carry &lt;a href="https://github.com/getsops/sops"&gt;SOPS&lt;/a&gt;-encrypted
files — typically a &lt;code&gt;Secret&lt;/code&gt; whose values are encrypted — and the controller
decrypts them in memory, before building and applying the manifests. This mirrors
Flux&amp;rsquo;s &lt;code&gt;kustomize-controller&lt;/code&gt; decryption contract, so an existing SOPS-encrypted
repository works unchanged.&lt;/p&gt;
&lt;p&gt;Set &lt;code&gt;spec.decryption&lt;/code&gt; and point it at a Secret holding the keys:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-yaml" data-lang="yaml"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;apiVersion&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;stages.metio.wtf/v1&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;kind&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;StageSet&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;payments&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;namespace&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;payments&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;spec&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;serviceAccountName&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;payments-deployer&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;decryption&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;sops &lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c"&gt;# the only provider&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;secretRef&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;sops-age &lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c"&gt;# a Secret in this namespace holding the age key&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;stages&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;- &lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;app&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;sourceRef&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;kind&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;GitRepository&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;payments-config &lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c"&gt;# contains an encrypted secret.yaml&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id="walkthrough--age"&gt;Walkthrough — age&lt;/h2&gt;
&lt;p&gt;&lt;a href="https://age-encryption.org/"&gt;age&lt;/a&gt; is the simplest key type and needs no external
service. Take a &lt;code&gt;Secret&lt;/code&gt; from plaintext to a GitOps-safe rollout in four steps.&lt;/p&gt;</description></item><item><title>SourceNotReady</title><link>https://stageset.projects.metio.wtf/runbooks/sourcenotready/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://stageset.projects.metio.wtf/runbooks/sourcenotready/</guid><description>&lt;h2 id="symptom"&gt;Symptom&lt;/h2&gt;
&lt;p&gt;&lt;code&gt;READY=False&lt;/code&gt;, &lt;code&gt;REASON=SourceNotReady&lt;/code&gt;. Transient: the controller requeues and clears the condition once the source publishes.&lt;/p&gt;
&lt;h2 id="cause"&gt;Cause&lt;/h2&gt;
&lt;p&gt;A stage&amp;rsquo;s &lt;code&gt;sourceRef&lt;/code&gt; resolved to an &lt;code&gt;ExternalArtifact&lt;/code&gt; (directly, or via a producer&amp;rsquo;s RFC-0012 back-pointer such as a JaaS &lt;code&gt;JsonnetSnippet&lt;/code&gt;), but that artifact&amp;rsquo;s &lt;code&gt;status.conditions[Ready]&lt;/code&gt; is not yet &lt;code&gt;True&lt;/code&gt; — its producer has not finished publishing a revision. The StageSet gates on &lt;code&gt;Ready=True&lt;/code&gt; so it never builds against a half-written artifact.&lt;/p&gt;
&lt;h2 id="diagnosis"&gt;Diagnosis&lt;/h2&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-shell" data-lang="shell"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# Which artifact, and is it Ready?&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;kubectl get externalartifact -n &amp;lt;namespace&amp;gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;kubectl describe externalartifact &amp;lt;name&amp;gt; -n &amp;lt;namespace&amp;gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# If the producer is a JsonnetSnippet (or other producer kind), check it:&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;kubectl describe jsonnetsnippet &amp;lt;name&amp;gt; -n &amp;lt;namespace&amp;gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id="remediation"&gt;Remediation&lt;/h2&gt;
&lt;p&gt;This usually clears on its own when the producer publishes. If it persists:&lt;/p&gt;</description></item><item><title>Stage sources — Git, OCI, Bucket</title><link>https://stageset.projects.metio.wtf/tutorials/flux-sources/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://stageset.projects.metio.wtf/tutorials/flux-sources/</guid><description>&lt;p&gt;A stage resolves its &lt;code&gt;sourceRef&lt;/code&gt; to a &lt;a href="https://fluxcd.io/"&gt;Flux&lt;/a&gt; artifact. You have
two routes:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-text" data-lang="text"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;manifests in Git / OCI / Bucket ──────────────────────────► StageSet (direct)
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;manifests in Git / OCI / Bucket ──► a renderer (JaaS) ──► ExternalArtifact ──► StageSet
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Use the &lt;strong&gt;direct&lt;/strong&gt; route when the source already holds ready-to-apply manifests
(the same thing Flux&amp;rsquo;s &lt;code&gt;kustomize-controller&lt;/code&gt; consumes). Use the &lt;strong&gt;renderer&lt;/strong&gt; route
when you generate manifests first — e.g. evaluating Jsonnet with
&lt;a href="https://jaas.projects.metio.wtf/"&gt;JaaS&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;This page is the copy-pasteable recipe per source kind. For how &lt;code&gt;sourceRef&lt;/code&gt;
resolution works as a concept — and the &lt;code&gt;path&lt;/code&gt;, &lt;code&gt;prune&lt;/code&gt;, &lt;code&gt;patches&lt;/code&gt;, and
&lt;code&gt;postBuild&lt;/code&gt; knobs that shape a stage — see
&lt;a href="https://stageset.projects.metio.wtf/usage/stages-and-sources/"&gt;stages and sources&lt;/a&gt;.&lt;/p&gt;</description></item><item><title>StageFailed</title><link>https://stageset.projects.metio.wtf/runbooks/stagefailed/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://stageset.projects.metio.wtf/runbooks/stagefailed/</guid><description>&lt;h2 id="symptom"&gt;Symptom&lt;/h2&gt;
&lt;p&gt;&lt;code&gt;READY=False&lt;/code&gt;, &lt;code&gt;REASON=StageFailed&lt;/code&gt;. The Message names the stage and the operation that failed (&lt;code&gt;fetch artifact&lt;/code&gt;, &lt;code&gt;build&lt;/code&gt;, &lt;code&gt;apply&lt;/code&gt;, &lt;code&gt;verify&lt;/code&gt;, a pre/post action, or &lt;code&gt;connect to target cluster&lt;/code&gt;). The run halts at that stage; later stages keep their previous revisions.&lt;/p&gt;
&lt;h2 id="cause"&gt;Cause&lt;/h2&gt;
&lt;p&gt;A stage failed during execution. By operation:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;fetch artifact&lt;/strong&gt; — the artifact URL was unreachable, or its bytes failed digest verification.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;build&lt;/strong&gt; — kustomize build or post-build substitution failed (a missing &lt;code&gt;substituteFrom&lt;/code&gt; source, an invalid patch, a malformed manifest).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;apply&lt;/strong&gt; — the server-side apply was rejected: an immutable-field conflict, or an &lt;strong&gt;RBAC denial&lt;/strong&gt; under the impersonated &lt;code&gt;serviceAccountName&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;verify&lt;/strong&gt; — applied objects did not become Ready within the stage timeout (kstatus).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;pre/post action&lt;/strong&gt; — a &lt;code&gt;patch&lt;/code&gt;/&lt;code&gt;http&lt;/code&gt;/&lt;code&gt;wait&lt;/code&gt;/&lt;code&gt;job&lt;/code&gt;/&lt;code&gt;delete&lt;/code&gt;/&lt;code&gt;apply&lt;/code&gt; action failed or timed out.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;connect to target cluster&lt;/strong&gt; — a &lt;code&gt;spec.kubeConfig&lt;/code&gt; Secret was missing, unparseable, or used the unsupported cloud-provider &lt;code&gt;configMapRef&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
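&lt;p&gt;For the RBAC-denial case under &lt;strong&gt;apply&lt;/strong&gt;, a quick check is to ask the apiserver whether the impersonated account may perform the rejected operation. The sketch below uses &lt;code&gt;create deployments&lt;/code&gt; purely as an example verb and resource; substitute whatever the Message reports:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-shell" data-lang="shell"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;# Prints yes or no for the ServiceAccount named in spec.serviceAccountName
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;kubectl auth can-i create deployments -n &amp;lt;namespace&amp;gt; \
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;  --as=system:serviceaccount:&amp;lt;namespace&amp;gt;:&amp;lt;serviceAccountName&amp;gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;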
&lt;h2 id="diagnosis"&gt;Diagnosis&lt;/h2&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-shell" data-lang="shell"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;kubectl describe stageset &amp;lt;name&amp;gt; -n &amp;lt;namespace&amp;gt; &lt;span class="c1"&gt;# Message: which stage + operation&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;kubectl -n stageset-system logs deploy/stageset-controller --tail&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="m"&gt;200&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# For apply/verify failures, inspect what the stage tried to apply:&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;kubectl get stageinventory -n &amp;lt;namespace&amp;gt; &lt;span class="se"&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; -l stages.metio.wtf/stage-set&lt;span class="o"&gt;=&lt;/span&gt;&amp;lt;name&amp;gt;,stages.metio.wtf/stage&lt;span class="o"&gt;=&lt;/span&gt;&amp;lt;stage&amp;gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id="remediation"&gt;Remediation&lt;/h2&gt;
&lt;p&gt;Match the operation in the Message:&lt;/p&gt;</description></item><item><title>StageInventory</title><link>https://stageset.projects.metio.wtf/api/stageinventory/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://stageset.projects.metio.wtf/api/stageinventory/</guid><description>&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-yaml" data-lang="yaml"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;apiVersion&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;stages.metio.wtf/v1&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;kind&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;StageInventory&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;A &lt;code&gt;StageInventory&lt;/code&gt; records the set of objects a single stage has applied, so the
controller can prune precisely and tear stages down in reverse order. &lt;strong&gt;You do not
author these&lt;/strong&gt; — the controller creates, updates, and deletes them. They are
documented here so you can read them when debugging and back them up.&lt;/p&gt;
&lt;p&gt;One stage may be backed by several &lt;code&gt;StageInventory&lt;/code&gt; shards once it exceeds
&lt;code&gt;--inventory-shard-cap&lt;/code&gt; entries (default 5000). Shard &lt;code&gt;0&lt;/code&gt; doubles as the ApplySet
(&lt;a href="https://github.com/kubernetes/enhancements/tree/master/keps/sig-cli/3659-kubectl-applyset"&gt;KEP-3659&lt;/a&gt;)
parent object for the stage.&lt;/p&gt;</description></item><item><title>Stages and sources</title><link>https://stageset.projects.metio.wtf/usage/stages-and-sources/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://stageset.projects.metio.wtf/usage/stages-and-sources/</guid><description>&lt;p&gt;A &lt;code&gt;StageSet&lt;/code&gt; is an ordered list of stages. Each stage resolves a
&lt;a href="https://fluxcd.io/"&gt;Flux&lt;/a&gt; source — a &lt;code&gt;GitRepository&lt;/code&gt;, &lt;code&gt;OCIRepository&lt;/code&gt;, &lt;code&gt;Bucket&lt;/code&gt;,
or an &lt;code&gt;ExternalArtifact&lt;/code&gt; (the default) — applies its manifests, waits for them to
become healthy, and only then lets the next stage start.&lt;/p&gt;
&lt;h2 id="one-stage"&gt;One stage&lt;/h2&gt;
&lt;p&gt;The minimum is one stage pointing at one artifact in the same namespace:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-yaml" data-lang="yaml"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;apiVersion&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;stages.metio.wtf/v1&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;kind&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;StageSet&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;my-app&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;namespace&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;default&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;spec&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;stages&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;    &lt;/span&gt;- &lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;app&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="nt"&gt;sourceRef&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;my-app &lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c"&gt;# an ExternalArtifact&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;&lt;code&gt;sourceRef.kind&lt;/code&gt; defaults to &lt;code&gt;ExternalArtifact&lt;/code&gt;, so the common case is a single
line. The controller fetches the artifact, applies every manifest in it, and marks
the stage &lt;code&gt;Ready&lt;/code&gt; once the applied objects report healthy.&lt;/p&gt;</description></item><item><title>StageSet</title><link>https://stageset.projects.metio.wtf/api/stageset/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://stageset.projects.metio.wtf/api/stageset/</guid><description>&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-yaml" data-lang="yaml"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;apiVersion&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;stages.metio.wtf/v1&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;kind&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;StageSet&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;A &lt;code&gt;StageSet&lt;/code&gt; is a namespaced &lt;a href="https://kubernetes.io/docs/"&gt;Kubernetes&lt;/a&gt; resource
describing an ordered set of stages. Only &lt;code&gt;spec.stages&lt;/code&gt; is required; everything else
refines scheduling, security, gating, versioning, and rollback. Every field below is
shown in YAML at least once.&lt;/p&gt;
&lt;p&gt;The smallest valid StageSet:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-yaml" data-lang="yaml"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;apiVersion&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;stages.metio.wtf/v1&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;kind&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;StageSet&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;my-app&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;namespace&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;default&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;spec&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;stages&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;    &lt;/span&gt;- &lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;app&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="nt"&gt;sourceRef&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;my-app&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;hr&gt;
&lt;h2 id="scheduling"&gt;Scheduling&lt;/h2&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-yaml" data-lang="yaml"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;spec&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;interval&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;5m &lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c"&gt;# optional: reconcile cadence (default: --default-interval)&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;retryInterval&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;1m &lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c"&gt;# cadence after a failed run (default: interval)&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;driftDetectionInterval&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;2m &lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c"&gt;# faster drift correction than interval (optional)&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;timeout&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;5m &lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c"&gt;# default per-stage timeout (optional)&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;suspend&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c"&gt;# pause reconciliation without deleting (default false)&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;interval&lt;/code&gt;&lt;/strong&gt; (optional) — steady-state reconcile cadence; each reconcile
re-resolves sources, re-asserts desired state (correcting drift), and prunes.
&lt;strong&gt;When omitted, the controller&amp;rsquo;s &lt;code&gt;--default-interval&lt;/code&gt; is used&lt;/strong&gt; (the chart&amp;rsquo;s
&lt;code&gt;controller.defaultInterval&lt;/code&gt;, default &lt;code&gt;10m&lt;/code&gt;), so most StageSets can leave it out.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;retryInterval&lt;/code&gt;&lt;/strong&gt; — retry cadence after a failure; falls back to &lt;code&gt;interval&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;driftDetectionInterval&lt;/code&gt;&lt;/strong&gt; — a shorter cadence dedicated to healing out-of-band
drift when you need it tighter than &lt;code&gt;interval&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;timeout&lt;/code&gt;&lt;/strong&gt; — how long any one stage may take before it fails; override per
stage with &lt;code&gt;stages[].timeout&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;suspend&lt;/code&gt;&lt;/strong&gt; — short-circuits to &lt;code&gt;Ready=False / Suspended&lt;/code&gt;, leaving applied state
running. Use &lt;a href="https://stageset.projects.metio.wtf/cli/reconcile/"&gt;&lt;code&gt;stagesetctl reconcile --force&lt;/code&gt;&lt;/a&gt; to run once while
suspended. See the &lt;a href="https://stageset.projects.metio.wtf/runbooks/suspended/"&gt;&lt;code&gt;Suspended&lt;/code&gt; runbook&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
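&lt;p&gt;A minimal sketch of the per-stage &lt;code&gt;timeout&lt;/code&gt; override described above. The stage and artifact names are hypothetical; only &lt;code&gt;spec.timeout&lt;/code&gt; and &lt;code&gt;stages[].timeout&lt;/code&gt; come from the field list:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-yaml" data-lang="yaml"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;spec&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;timeout&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;5m &lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c"&gt;# applies to every stage without its own timeout&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;stages&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;    &lt;/span&gt;- &lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;database &lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c"&gt;# hypothetical slow stage&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="nt"&gt;timeout&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;15m &lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c"&gt;# overrides spec.timeout for this stage only&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="nt"&gt;sourceRef&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;my-database&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;    &lt;/span&gt;- &lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;app &lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c"&gt;# inherits the 5m default&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="nt"&gt;sourceRef&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;my-app&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;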
&lt;h2 id="ordering-between-stagesets"&gt;Ordering between StageSets&lt;/h2&gt;
&lt;p&gt;&lt;code&gt;dependsOn&lt;/code&gt; gates this StageSet on others being Ready at their observed generation
— cross-release ordering. (Ordering &lt;em&gt;within&lt;/em&gt; a StageSet is the order of &lt;code&gt;stages&lt;/code&gt;.)&lt;/p&gt;</description></item><item><title>StageSet vs Argo Rollouts</title><link>https://stageset.projects.metio.wtf/comparisons/argo-rollouts/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://stageset.projects.metio.wtf/comparisons/argo-rollouts/</guid><description>&lt;p&gt;&lt;a href="https://argoproj.github.io/argo-rollouts/"&gt;Argo Rollouts&lt;/a&gt; and &lt;code&gt;StageSet&lt;/code&gt; are easy
to mention in the same breath because both roll things out gradually, but they
operate at different layers and are complementary rather than competing.&lt;/p&gt;
&lt;h2 id="what-argo-rollouts-does"&gt;What Argo Rollouts does&lt;/h2&gt;
&lt;p&gt;&lt;code&gt;Argo Rollouts&lt;/code&gt; replaces a &lt;code&gt;Deployment&lt;/code&gt; with a &lt;code&gt;Rollout&lt;/code&gt; that shifts traffic to a new
version &lt;strong&gt;progressively&lt;/strong&gt; — canary or blue-green — pausing at weighted steps and
promoting based on &lt;strong&gt;metric analysis&lt;/strong&gt; (Prometheus queries, web/Job providers).
Its unit of work is a &lt;strong&gt;single workload&amp;rsquo;s&lt;/strong&gt; version transition and the traffic in
front of it.&lt;/p&gt;</description></item><item><title>StageSet vs Flux kustomize-controller</title><link>https://stageset.projects.metio.wtf/comparisons/flux/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://stageset.projects.metio.wtf/comparisons/flux/</guid><description>&lt;p&gt;This is the closest comparison — &lt;code&gt;StageSet&lt;/code&gt; is built &lt;em&gt;for&lt;/em&gt;
&lt;a href="https://fluxcd.io/"&gt;Flux&lt;/a&gt; and borrows its conventions. Flux&amp;rsquo;s &lt;code&gt;kustomize-controller&lt;/code&gt;
(and &lt;code&gt;helm-controller&lt;/code&gt;) reconcile a source into the cluster continuously, exactly
like &lt;code&gt;StageSet&lt;/code&gt;. The difference is granularity.&lt;/p&gt;
&lt;h2 id="what-kustomize-controller-gives-you"&gt;What kustomize-controller gives you&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Continuous reconciliation of a &lt;code&gt;Kustomization&lt;/code&gt; from a Flux source, with pruning,
health checks, drift correction, and &lt;code&gt;dependsOn&lt;/code&gt; ordering &lt;strong&gt;between&lt;/strong&gt;
Kustomizations.&lt;/li&gt;
&lt;li&gt;Impersonation via &lt;code&gt;serviceAccountName&lt;/code&gt;, &lt;code&gt;postBuild&lt;/code&gt; substitution, patches — the
same surface &lt;code&gt;StageSet&lt;/code&gt; deliberately mirrors.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="where-stageset-differs"&gt;Where StageSet differs&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Ordering within a release.&lt;/strong&gt; &lt;code&gt;kustomize-controller&lt;/code&gt; applies one Kustomization
as a unit; ordering exists only &lt;em&gt;between&lt;/em&gt; Kustomizations via &lt;code&gt;dependsOn&lt;/code&gt;. To
sequence three steps you create three Kustomizations and wire their
dependencies. &lt;code&gt;StageSet&lt;/code&gt; expresses that as one resource with ordered &lt;code&gt;stages&lt;/code&gt; —
and the controller waits for each stage&amp;rsquo;s health before the next.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Typed actions between steps.&lt;/strong&gt; Migrations, HTTP gates, waits, and transient
applies are first-class &lt;a href="https://stageset.projects.metio.wtf/usage/actions/"&gt;actions&lt;/a&gt;; in plain Flux you&amp;rsquo;d model
these as extra Kustomizations and Jobs.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Release-level features.&lt;/strong&gt; &lt;a href="https://stageset.projects.metio.wtf/usage/update-windows/"&gt;Update windows&lt;/a&gt;,
&lt;a href="https://stageset.projects.metio.wtf/usage/versioned-migrations/"&gt;versioned migrations&lt;/a&gt;, and
&lt;a href="https://stageset.projects.metio.wtf/usage/rollback/"&gt;rollback&lt;/a&gt; operate across the whole staged release.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source-native.&lt;/strong&gt; A stage consumes a &lt;code&gt;GitRepository&lt;/code&gt;/&lt;code&gt;OCIRepository&lt;/code&gt;/&lt;code&gt;Bucket&lt;/code&gt;
directly (just like &lt;code&gt;kustomize-controller&lt;/code&gt;), or an &lt;code&gt;ExternalArtifact&lt;/code&gt; (RFC-0012),
or a &lt;em&gt;producer&lt;/em&gt; resolved to its artifact — which is how it also pairs with
renderers like &lt;a href="https://jaas.projects.metio.wtf/"&gt;JaaS&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;SOPS parity.&lt;/strong&gt; Encrypted Secrets in a source decrypt the same way, via
&lt;a href="https://stageset.projects.metio.wtf/usage/encryption/"&gt;&lt;code&gt;spec.decryption&lt;/code&gt;&lt;/a&gt; (age, PGP, or cloud KMS), so a SOPS-using
repo ports across unchanged.&lt;/li&gt;
&lt;/ul&gt;
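&lt;p&gt;As a rough sketch of the first point, three ordered steps that would otherwise need three Kustomizations can be written as one resource. The release, stage, and source names are hypothetical, and the &lt;code&gt;OCIRepository&lt;/code&gt; reference assumes that &lt;code&gt;sourceRef.kind&lt;/code&gt; plus &lt;code&gt;name&lt;/code&gt; is enough to select that source:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-yaml" data-lang="yaml"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;apiVersion&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;stages.metio.wtf/v1&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;kind&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;StageSet&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;my-release &lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c"&gt;# hypothetical name&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;namespace&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;default&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;spec&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;stages&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;    &lt;/span&gt;- &lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;database &lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c"&gt;# step 1: must be healthy before step 2 starts&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="nt"&gt;sourceRef&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;my-database &lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c"&gt;# kind defaults to ExternalArtifact&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;    &lt;/span&gt;- &lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;migrations &lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c"&gt;# step 2&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="nt"&gt;sourceRef&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;my-migrations&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;    &lt;/span&gt;- &lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;app &lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c"&gt;# step 3&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="nt"&gt;sourceRef&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="nt"&gt;kind&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;OCIRepository &lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c"&gt;# assumption: a Flux source consumed directly&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;my-app&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Stages run top to bottom, and each one must report healthy before the next starts, so no &lt;code&gt;dependsOn&lt;/code&gt; wiring is needed inside the release.&lt;/p&gt;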
&lt;h2 id="using-them-together"&gt;Using them together&lt;/h2&gt;
&lt;p&gt;&lt;code&gt;StageSet&lt;/code&gt; sits alongside the other Flux controllers and reuses Flux&amp;rsquo;s source layer,
notifications (&lt;code&gt;Alert&lt;/code&gt;/&lt;code&gt;Provider&lt;/code&gt; targeting &lt;code&gt;kind: StageSet&lt;/code&gt;), and reconcile
annotations. Use &lt;code&gt;kustomize-controller&lt;/code&gt; for ordinary single-unit reconciliation and
reach for &lt;code&gt;StageSet&lt;/code&gt; when a release needs ordered, gated stages.&lt;/p&gt;</description></item><item><title>StageSet vs Helm</title><link>https://stageset.projects.metio.wtf/comparisons/helm/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://stageset.projects.metio.wtf/comparisons/helm/</guid><description>&lt;p&gt;&lt;a href="https://helm.sh/"&gt;Helm&lt;/a&gt; is two things: a templating engine (charts) and an
imperative release tool (&lt;code&gt;helm upgrade&lt;/code&gt;). &lt;code&gt;StageSet&lt;/code&gt; is neither — it&amp;rsquo;s a declarative
delivery controller. The overlap is ordering: Helm&amp;rsquo;s hooks and hook weights give you
&lt;em&gt;some&lt;/em&gt; sequencing inside a single chart&amp;rsquo;s install/upgrade.&lt;/p&gt;
&lt;h2 id="what-helm-gives-you"&gt;What Helm gives you&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Templated, parameterized manifests (charts and values).&lt;/li&gt;
&lt;li&gt;Install/upgrade ordering via &lt;code&gt;helm.sh/hook&lt;/code&gt; (pre-install, post-upgrade, …) and
&lt;code&gt;hook-weight&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;A release history you can roll back to with &lt;code&gt;helm rollback&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="where-stageset-differs"&gt;Where StageSet differs&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Continuous reconciliation.&lt;/strong&gt; &lt;code&gt;helm upgrade&lt;/code&gt; is a point-in-time, imperative
action; nothing re-asserts the state afterward. &lt;code&gt;StageSet&lt;/code&gt; reconciles on an
interval, corrects drift, and prunes — it&amp;rsquo;s GitOps, not a one-shot.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Ordering across artifacts, not just within one chart.&lt;/strong&gt; Helm hooks order
resources &lt;em&gt;inside&lt;/em&gt; a release. &lt;code&gt;StageSet&lt;/code&gt; orders whole &lt;em&gt;stages&lt;/em&gt;, each its own
artifact, with readiness gating between them.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Typed gates between steps.&lt;/strong&gt; Hooks run Jobs; &lt;code&gt;StageSet&lt;/code&gt; stages can run Jobs,
HTTP gates, waits, patches, deletes, and transient applies, as pre/post/onFailure
&lt;a href="https://stageset.projects.metio.wtf/usage/actions/"&gt;actions&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Identity.&lt;/strong&gt; A &lt;code&gt;StageSet&lt;/code&gt; applies under an impersonated, per-tenant
&lt;code&gt;ServiceAccount&lt;/code&gt;; &lt;code&gt;helm upgrade&lt;/code&gt; runs as whoever ran it.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="using-them-together"&gt;Using them together&lt;/h2&gt;
&lt;p&gt;Render a chart to manifests (e.g. via a producer that publishes an
&lt;code&gt;ExternalArtifact&lt;/code&gt;) and deliver it with &lt;code&gt;StageSet&lt;/code&gt;. The controller understands
&lt;code&gt;helm.sh/hook&lt;/code&gt; resources: &lt;code&gt;applyHelmHookResources&lt;/code&gt; (default &lt;code&gt;true&lt;/code&gt;) applies them as
ordinary objects, so a Helm-style chart&amp;rsquo;s hook resources still get created — now
under &lt;code&gt;StageSet&lt;/code&gt;&amp;rsquo;s ordering and gating instead of Helm&amp;rsquo;s.&lt;/p&gt;</description></item><item><title>StageSet vs jsonnet-controller</title><link>https://stageset.projects.metio.wtf/comparisons/jsonnet-controller/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://stageset.projects.metio.wtf/comparisons/jsonnet-controller/</guid><description>&lt;p&gt;&lt;a href="https://github.com/pelotech/jsonnet-controller"&gt;jsonnet-controller&lt;/a&gt; (pelotech) is a
Flux controller that evaluates Jsonnet (kubecfg- and Tanka-style) and applies the
result to the cluster. Its &lt;code&gt;Konfiguration&lt;/code&gt; resource (&lt;code&gt;jsonnet.io/v1beta1&lt;/code&gt;) is, in
effect, &lt;em&gt;kustomize-controller for Jsonnet&lt;/em&gt;: point it at a &lt;code&gt;GitRepository&lt;/code&gt; (or an
HTTP(S) URL), and it builds the Jsonnet and reconciles the manifests — with
pruning, health/revision tracking, TLA string/code variables, and &lt;code&gt;dependsOn&lt;/code&gt;
ordering &lt;strong&gt;between&lt;/strong&gt; Konfigurations.&lt;/p&gt;
&lt;p&gt;The two projects sit at &lt;strong&gt;different layers&lt;/strong&gt;, which is the whole comparison.&lt;/p&gt;</description></item><item><title>StageSet vs Kustomize</title><link>https://stageset.projects.metio.wtf/comparisons/kustomize/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://stageset.projects.metio.wtf/comparisons/kustomize/</guid><description>&lt;p&gt;&lt;a href="https://kustomize.io/"&gt;Kustomize&lt;/a&gt; (the &lt;code&gt;kustomize&lt;/code&gt; CLI / &lt;code&gt;kubectl kustomize&lt;/code&gt;) is a
manifest &lt;em&gt;builder&lt;/em&gt;: it composes bases and overlays, applies patches, and emits YAML.
It does not apply anything, and it has no notion of ordering, readiness, or
reconciliation — that&amp;rsquo;s &lt;code&gt;kubectl apply&lt;/code&gt;&amp;rsquo;s job, and &lt;code&gt;kubectl&lt;/code&gt; applies everything at
once.&lt;/p&gt;
&lt;h2 id="what-kustomize-gives-you"&gt;What Kustomize gives you&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Overlay composition, strategic-merge and JSON6902 patches, variable replacement,
generators.&lt;/li&gt;
&lt;li&gt;A pure transformation: in goes a kustomization, out come manifests.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="where-stageset-differs"&gt;Where StageSet differs&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;It delivers, not just builds.&lt;/strong&gt; Kustomize stops at YAML. &lt;code&gt;StageSet&lt;/code&gt; applies it,
waits for health, prunes what&amp;rsquo;s gone, and keeps doing so.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Ordering and gates.&lt;/strong&gt; &lt;code&gt;kubectl apply -k&lt;/code&gt; has no stages and no gates. &lt;code&gt;StageSet&lt;/code&gt;
sequences stages and runs &lt;a href="https://stageset.projects.metio.wtf/usage/actions/"&gt;actions&lt;/a&gt; between them.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Continuous reconciliation and drift correction&lt;/strong&gt;, versus a one-shot &lt;code&gt;apply&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="using-them-together"&gt;Using them together&lt;/h2&gt;
&lt;p&gt;&lt;code&gt;StageSet&lt;/code&gt; &lt;em&gt;includes&lt;/em&gt; the parts of Kustomize you reach for at delivery time: a stage
has &lt;code&gt;path&lt;/code&gt;, &lt;code&gt;patches&lt;/code&gt;, and &lt;code&gt;postBuild&lt;/code&gt; substitution
(&lt;a href="https://stageset.projects.metio.wtf/usage/stages-and-sources/"&gt;stages and sources&lt;/a&gt;). So you can keep authoring with
Kustomize overlays and let a stage apply the right overlay, patched and
substituted — then add the ordering, gating, and reconciliation Kustomize alone
doesn&amp;rsquo;t offer.&lt;/p&gt;</description></item><item><title>StageSet vs Tanka and kubecfg</title><link>https://stageset.projects.metio.wtf/comparisons/tanka-kubecfg/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://stageset.projects.metio.wtf/comparisons/tanka-kubecfg/</guid><description>&lt;p&gt;&lt;a href="https://tanka.dev/"&gt;Tanka&lt;/a&gt; and &lt;a href="https://github.com/kubecfg/kubecfg"&gt;kubecfg&lt;/a&gt; are
Jsonnet-based config tools: you express your resources in Jsonnet, the tool renders
them, diffs against the cluster, and applies. They generate configuration and run a
CLI-driven apply, but they are imperative tools you run, not controllers that
reconcile.&lt;/p&gt;
&lt;h2 id="what-tanka--kubecfg-give-you"&gt;What Tanka / kubecfg give you&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Jsonnet-powered, DRY manifest generation (libraries, abstractions, environments).&lt;/li&gt;
&lt;li&gt;A &lt;code&gt;diff&lt;/code&gt;/&lt;code&gt;apply&lt;/code&gt; workflow with dependency-aware ordering of a single apply.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="where-stageset-differs"&gt;Where StageSet differs&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Reconciliation, not invocation.&lt;/strong&gt; Tanka/kubecfg apply when &lt;em&gt;you&lt;/em&gt; run them.
&lt;code&gt;StageSet&lt;/code&gt; runs in-cluster and continuously reconciles, corrects drift, and
prunes.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Staged, gated delivery.&lt;/strong&gt; They apply a rendered set (in dependency order);
they don&amp;rsquo;t model multi-stage rollouts with readiness gates, update windows, or
versioned migrations between stages.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;GitOps identity and tenancy.&lt;/strong&gt; &lt;code&gt;StageSet&lt;/code&gt; applies under an impersonated tenant
&lt;code&gt;ServiceAccount&lt;/code&gt; inside the cluster; Tanka/kubecfg use your local credentials.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="using-them-together"&gt;Using them together&lt;/h2&gt;
&lt;p&gt;The Jsonnet &lt;em&gt;generation&lt;/em&gt; that Tanka and kubecfg do so well has a GitOps-native
equivalent in two related projects:&lt;/p&gt;</description></item><item><title>stagesetctl build</title><link>https://stageset.projects.metio.wtf/cli/build/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://stageset.projects.metio.wtf/cli/build/</guid><description>&lt;p&gt;Runs the same resolve → fetch → build pipeline the controller uses and writes the
result — a multi-document YAML stream — to stdout. This is what would be applied,
before it is applied. To preview the change against live cluster state instead, use
&lt;a href="https://stageset.projects.metio.wtf/cli/diff/"&gt;&lt;code&gt;diff&lt;/code&gt;&lt;/a&gt;.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-text" data-lang="text"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;stagesetctl build NAME [flags]
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;table&gt;
	&lt;thead&gt;
			&lt;tr&gt;
					&lt;th&gt;Flag&lt;/th&gt;
					&lt;th&gt;Default&lt;/th&gt;
					&lt;th&gt;Description&lt;/th&gt;
			&lt;/tr&gt;
	&lt;/thead&gt;
	&lt;tbody&gt;
			&lt;tr&gt;
					&lt;td&gt;&lt;code&gt;--stage&lt;/code&gt;&lt;/td&gt;
					&lt;td&gt;&lt;em&gt;(all)&lt;/em&gt;&lt;/td&gt;
					&lt;td&gt;Render only the named stage(s); repeatable.&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;&lt;code&gt;--source-dir&lt;/code&gt;&lt;/td&gt;
					&lt;td&gt;&lt;em&gt;(none)&lt;/em&gt;&lt;/td&gt;
					&lt;td&gt;Use a local artifact tree as &lt;code&gt;[STAGE=]PATH&lt;/code&gt; instead of fetching from the cluster; repeatable.&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;&lt;code&gt;--show-secrets&lt;/code&gt;&lt;/td&gt;
					&lt;td&gt;&lt;code&gt;false&lt;/code&gt;&lt;/td&gt;
					&lt;td&gt;Reveal Secret values instead of masking them.&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;&lt;code&gt;--as-tenant&lt;/code&gt;&lt;/td&gt;
					&lt;td&gt;&lt;code&gt;false&lt;/code&gt;&lt;/td&gt;
					&lt;td&gt;Render impersonating the StageSet&amp;rsquo;s &lt;code&gt;spec.serviceAccountName&lt;/code&gt; (see &lt;a href="https://stageset.projects.metio.wtf/usage/multi-cluster/"&gt;multi-cluster and tenancy&lt;/a&gt;).&lt;/td&gt;
			&lt;/tr&gt;
	&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;Secret values are masked by default, so the output is safe to paste into a review.
&lt;code&gt;build&lt;/code&gt; writes YAML unconditionally — there is no output-format flag.&lt;/p&gt;</description></item><item><title>stagesetctl diff</title><link>https://stageset.projects.metio.wtf/cli/diff/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://stageset.projects.metio.wtf/cli/diff/</guid><description>&lt;p&gt;By default &lt;code&gt;diff&lt;/code&gt; performs a
&lt;a href="https://kubernetes.io/docs/reference/using-api/server-side-apply/"&gt;server-side&lt;/a&gt;
dry-run apply and exits &lt;code&gt;1&lt;/code&gt; when there are changes, so it works as a CI gate. It
shows, per object, what a reconcile would create, configure, or delete, plus the
&lt;a href="https://stageset.projects.metio.wtf/usage/actions/"&gt;actions&lt;/a&gt; a rollout would run. To see the full rendered manifests
without comparing against the cluster, use &lt;a href="https://stageset.projects.metio.wtf/cli/build/"&gt;&lt;code&gt;build&lt;/code&gt;&lt;/a&gt;.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-text" data-lang="text"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;stagesetctl diff NAME [flags]
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;table&gt;
	&lt;thead&gt;
			&lt;tr&gt;
					&lt;th&gt;Flag&lt;/th&gt;
					&lt;th&gt;Default&lt;/th&gt;
					&lt;th&gt;Description&lt;/th&gt;
			&lt;/tr&gt;
	&lt;/thead&gt;
	&lt;tbody&gt;
			&lt;tr&gt;
					&lt;td&gt;&lt;code&gt;--stage&lt;/code&gt;&lt;/td&gt;
					&lt;td&gt;&lt;em&gt;(all)&lt;/em&gt;&lt;/td&gt;
					&lt;td&gt;Diff only the named stage(s); repeatable.&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;&lt;code&gt;--source-dir&lt;/code&gt;&lt;/td&gt;
					&lt;td&gt;&lt;em&gt;(none)&lt;/em&gt;&lt;/td&gt;
					&lt;td&gt;Use a local artifact tree as &lt;code&gt;[STAGE=]PATH&lt;/code&gt;; repeatable. Skips the cluster fetch.&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;&lt;code&gt;--server-side&lt;/code&gt;&lt;/td&gt;
					&lt;td&gt;&lt;code&gt;true&lt;/code&gt;&lt;/td&gt;
					&lt;td&gt;Server-side dry-run apply diff (needs update/patch RBAC). &lt;code&gt;false&lt;/code&gt; renders client-side against live objects.&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;&lt;code&gt;--as-tenant&lt;/code&gt;&lt;/td&gt;
					&lt;td&gt;&lt;code&gt;false&lt;/code&gt;&lt;/td&gt;
					&lt;td&gt;Render and dry-run impersonating &lt;code&gt;spec.serviceAccountName&lt;/code&gt; (see &lt;a href="https://stageset.projects.metio.wtf/usage/multi-cluster/"&gt;multi-cluster and tenancy&lt;/a&gt;).&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;&lt;code&gt;--show-secrets&lt;/code&gt;&lt;/td&gt;
					&lt;td&gt;&lt;code&gt;false&lt;/code&gt;&lt;/td&gt;
					&lt;td&gt;Reveal Secret values instead of masking.&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;&lt;code&gt;--show-unchanged&lt;/code&gt;&lt;/td&gt;
					&lt;td&gt;&lt;code&gt;false&lt;/code&gt;&lt;/td&gt;
					&lt;td&gt;Include objects with no change.&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;&lt;code&gt;--prune&lt;/code&gt;&lt;/td&gt;
					&lt;td&gt;&lt;code&gt;true&lt;/code&gt;&lt;/td&gt;
					&lt;td&gt;Show resources that would be deleted (fell out of inventory).&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;&lt;code&gt;--color&lt;/code&gt;&lt;/td&gt;
					&lt;td&gt;&lt;code&gt;auto&lt;/code&gt;&lt;/td&gt;
					&lt;td&gt;Colorize output: &lt;code&gt;auto&lt;/code&gt;, &lt;code&gt;always&lt;/code&gt;, or &lt;code&gt;never&lt;/code&gt;.&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;&lt;code&gt;--exit-code&lt;/code&gt;&lt;/td&gt;
					&lt;td&gt;&lt;code&gt;true&lt;/code&gt;&lt;/td&gt;
					&lt;td&gt;Exit &lt;code&gt;1&lt;/code&gt; when changes are found. With &lt;code&gt;false&lt;/code&gt;, a run that completes without errors exits &lt;code&gt;0&lt;/code&gt; even when changes are found.&lt;/td&gt;
			&lt;/tr&gt;
	&lt;/tbody&gt;
&lt;/table&gt;
&lt;h2 id="example"&gt;Example&lt;/h2&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-shell" data-lang="shell"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;stagesetctl diff payments
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-text" data-lang="text"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;--- live
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;+++ merged
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;@@ Deployment payments/web @@
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; spec:
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;- replicas: 3
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;+ replicas: 6
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;- ConfigMap payments/old-feature-flags (pruned: fell out of inventory)
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;Actions to run:
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; application:
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; pre db-migrate job ledger-migrations
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; post smoke-test http https://payments.internal/healthz
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Objects that left the stage&amp;rsquo;s &lt;a href="https://stageset.projects.metio.wtf/api/stageinventory/"&gt;inventory&lt;/a&gt; show as deletions
(&lt;code&gt;pruned: …&lt;/code&gt;); pass &lt;code&gt;--prune=false&lt;/code&gt; to hide them. The trailing &lt;code&gt;Actions to run&lt;/code&gt;
block lists the &lt;a href="https://stageset.projects.metio.wtf/usage/actions/"&gt;pre/post/onFailure actions&lt;/a&gt; a real reconcile
would execute — &lt;code&gt;diff&lt;/code&gt; never runs them, it only reports them.&lt;/p&gt;</description></item><item><title>stagesetctl get</title><link>https://stageset.projects.metio.wtf/cli/get/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://stageset.projects.metio.wtf/cli/get/</guid><description>&lt;p&gt;With no &lt;code&gt;NAME&lt;/code&gt;, lists StageSets in the current namespace. With a &lt;code&gt;NAME&lt;/code&gt;, prints that
StageSet&amp;rsquo;s detail (Ready reason, per-stage phase, revisions, version) — a readable
view of &lt;a href="https://stageset.projects.metio.wtf/api/stageset/#status"&gt;&lt;code&gt;StageSet.status&lt;/code&gt;&lt;/a&gt;.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-text" data-lang="text"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;stagesetctl get [NAME] [flags]
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;table&gt;
	&lt;thead&gt;
			&lt;tr&gt;
					&lt;th&gt;Flag&lt;/th&gt;
					&lt;th&gt;Default&lt;/th&gt;
					&lt;th&gt;Description&lt;/th&gt;
			&lt;/tr&gt;
	&lt;/thead&gt;
	&lt;tbody&gt;
			&lt;tr&gt;
					&lt;td&gt;&lt;code&gt;-A&lt;/code&gt;, &lt;code&gt;--all-namespaces&lt;/code&gt;&lt;/td&gt;
					&lt;td&gt;&lt;code&gt;false&lt;/code&gt;&lt;/td&gt;
					&lt;td&gt;List StageSets across all namespaces.&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;&lt;code&gt;-o&lt;/code&gt;, &lt;code&gt;--output&lt;/code&gt;&lt;/td&gt;
					&lt;td&gt;&lt;em&gt;(table)&lt;/em&gt;&lt;/td&gt;
					&lt;td&gt;Output format: empty for the human table, or &lt;code&gt;yaml&lt;/code&gt; / &lt;code&gt;json&lt;/code&gt;.&lt;/td&gt;
			&lt;/tr&gt;
	&lt;/tbody&gt;
&lt;/table&gt;
&lt;h2 id="listing"&gt;Listing&lt;/h2&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-shell" data-lang="shell"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;stagesetctl get -A
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-text" data-lang="text"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;NAMESPACE NAME READY REASON STAGES VERSION PENDING
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;payments payments True Succeeded 2/2 2.1.0 -
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;platform platform True Succeeded 3/3 - -
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;staging web False StageFailed 1/2 - -
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;&lt;code&gt;STAGES&lt;/code&gt; is &lt;code&gt;ready/total&lt;/code&gt;; &lt;code&gt;PENDING&lt;/code&gt; shows &lt;code&gt;held until &amp;lt;time&amp;gt;&lt;/code&gt; when an
&lt;a href="https://stageset.projects.metio.wtf/usage/update-windows/"&gt;update window&lt;/a&gt; is holding a rollout. A &lt;code&gt;False&lt;/code&gt; &lt;code&gt;READY&lt;/code&gt;
maps to a &lt;a href="https://stageset.projects.metio.wtf/runbooks/"&gt;runbook&lt;/a&gt; by its &lt;code&gt;REASON&lt;/code&gt;.&lt;/p&gt;</description></item><item><title>stagesetctl reconcile</title><link>https://stageset.projects.metio.wtf/cli/reconcile/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://stageset.projects.metio.wtf/cli/reconcile/</guid><description>&lt;p&gt;Stamps the &lt;code&gt;reconcile.fluxcd.io/requestedAt&lt;/code&gt;
&lt;a href="https://stageset.projects.metio.wtf/api/stageinventory/#well-known-labels-and-annotations"&gt;annotation&lt;/a&gt; to trigger a
reconcile now, optionally waiting for the controller to report it handled.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-text" data-lang="text"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;stagesetctl reconcile NAME [flags]
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;table&gt;
	&lt;thead&gt;
			&lt;tr&gt;
					&lt;th&gt;Flag&lt;/th&gt;
					&lt;th&gt;Default&lt;/th&gt;
					&lt;th&gt;Description&lt;/th&gt;
			&lt;/tr&gt;
	&lt;/thead&gt;
	&lt;tbody&gt;
			&lt;tr&gt;
					&lt;td&gt;&lt;code&gt;--stage&lt;/code&gt;&lt;/td&gt;
					&lt;td&gt;&lt;em&gt;(all)&lt;/em&gt;&lt;/td&gt;
					&lt;td&gt;Force only this stage to re-run its actions (single-stage reconcile).&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;&lt;code&gt;--with-source&lt;/code&gt;&lt;/td&gt;
					&lt;td&gt;&lt;code&gt;false&lt;/code&gt;&lt;/td&gt;
					&lt;td&gt;Also re-request the stage sources before reconciling.&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;&lt;code&gt;--update-now&lt;/code&gt;&lt;/td&gt;
					&lt;td&gt;&lt;code&gt;false&lt;/code&gt;&lt;/td&gt;
					&lt;td&gt;Apply a window-held rollout immediately, bypassing update windows.&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;&lt;code&gt;--force&lt;/code&gt;&lt;/td&gt;
					&lt;td&gt;&lt;code&gt;false&lt;/code&gt;&lt;/td&gt;
					&lt;td&gt;Proceed even when the StageSet is suspended.&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;&lt;code&gt;--wait&lt;/code&gt;&lt;/td&gt;
					&lt;td&gt;&lt;code&gt;false&lt;/code&gt;&lt;/td&gt;
					&lt;td&gt;Block until the controller reports the request handled.&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;&lt;code&gt;--timeout&lt;/code&gt;&lt;/td&gt;
					&lt;td&gt;&lt;code&gt;5m&lt;/code&gt;&lt;/td&gt;
					&lt;td&gt;How long to wait with &lt;code&gt;--wait&lt;/code&gt;.&lt;/td&gt;
			&lt;/tr&gt;
	&lt;/tbody&gt;
&lt;/table&gt;
&lt;h2 id="example"&gt;Example&lt;/h2&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-shell" data-lang="shell"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;stagesetctl reconcile payments -n payments
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-text" data-lang="text"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;Reconcile requested for StageSet payments (token 2026-06-15T09:30:00Z)
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Force just one stage to re-run its actions:&lt;/p&gt;</description></item><item><title>Stalled</title><link>https://stageset.projects.metio.wtf/runbooks/stalled/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://stageset.projects.metio.wtf/runbooks/stalled/</guid><description>&lt;h2 id="symptom"&gt;Symptom&lt;/h2&gt;
&lt;p&gt;&lt;code&gt;READY=False&lt;/code&gt;, &lt;code&gt;REASON=Stalled&lt;/code&gt;. Terminal: the controller does not requeue until the spec changes.&lt;/p&gt;
&lt;h2 id="cause"&gt;Cause&lt;/h2&gt;
&lt;p&gt;A condition that retrying cannot clear. Currently this is a &lt;strong&gt;&lt;code&gt;spec.dependsOn&lt;/code&gt; cycle&lt;/strong&gt; — two or more StageSets depend on each other (directly or transitively), so none can ever become Ready first. The cycle is detected by a breadth-first walk over the &lt;code&gt;dependsOn&lt;/code&gt; graph. A dependency that is merely not Ready yet (no cycle) reports &lt;a href="https://stageset.projects.metio.wtf/runbooks/dependencynotready/"&gt;&lt;code&gt;DependencyNotReady&lt;/code&gt;&lt;/a&gt; instead.&lt;/p&gt;
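&lt;p&gt;The breadth-first walk described above can be sketched as follows. This is an illustration only, not the controller&amp;rsquo;s implementation: the function name &lt;code&gt;in_depends_on_cycle&lt;/code&gt; and the plain dictionary standing in for the &lt;code&gt;dependsOn&lt;/code&gt; graph are assumptions made for the example.&lt;/p&gt;

```python
from collections import deque


def in_depends_on_cycle(start, depends_on):
    """Return True if `start` can reach itself by following dependsOn edges.

    `depends_on` maps a StageSet name to the list of names it depends on
    (a hypothetical in-memory stand-in for spec.dependsOn).
    """
    queue = deque(depends_on.get(start, []))
    seen = set()
    while queue:
        name = queue.popleft()
        if name == start:
            # The walk came back to where it began: a cycle through `start`.
            return True
        if name in seen:
            continue
        seen.add(name)
        queue.extend(depends_on.get(name, []))
    return False


# A depends on B and B depends on A: both sit on the cycle.
graph = {"a": ["b"], "b": ["a"], "c": ["a"]}
print(in_depends_on_cycle("a", graph))
print(in_depends_on_cycle("c", graph))
```

&lt;p&gt;In this sketch &lt;code&gt;c&lt;/code&gt; depends on a cycle without being part of one, so the walk returns &lt;code&gt;False&lt;/code&gt; for it. The &lt;code&gt;seen&lt;/code&gt; set keeps the walk from looping forever on a cycle that does not include the starting StageSet.&lt;/p&gt;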
&lt;h2 id="diagnosis"&gt;Diagnosis&lt;/h2&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-shell" data-lang="shell"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;kubectl describe stageset &amp;lt;name&amp;gt; -n &amp;lt;namespace&amp;gt; &lt;span class="c1"&gt;# Message states &amp;#34;spec.dependsOn forms a cycle&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# Trace the edges:&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;kubectl get stageset -n &amp;lt;namespace&amp;gt; &lt;span class="se"&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; -o custom-columns&lt;span class="o"&gt;=&lt;/span&gt;NAME:.metadata.name,DEPENDSON:.spec.dependsOn&lt;span class="o"&gt;[&lt;/span&gt;*&lt;span class="o"&gt;]&lt;/span&gt;.name
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Follow the &lt;code&gt;dependsOn&lt;/code&gt; names until you find the loop (A → B → A, or longer).&lt;/p&gt;</description></item><item><title>Succeeded</title><link>https://stageset.projects.metio.wtf/runbooks/succeeded/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://stageset.projects.metio.wtf/runbooks/succeeded/</guid><description>&lt;h2 id="symptom"&gt;Symptom&lt;/h2&gt;
&lt;p&gt;&lt;code&gt;READY=True&lt;/code&gt;, &lt;code&gt;REASON=Succeeded&lt;/code&gt;. The Message names the applied revisions.&lt;/p&gt;
&lt;h2 id="cause"&gt;Cause&lt;/h2&gt;
&lt;p&gt;This is the healthy steady state: every stage&amp;rsquo;s artifact resolved, built, applied, and passed its readiness checks, and &lt;code&gt;status.lastAppliedRevisions&lt;/code&gt; matches &lt;code&gt;status.lastAttemptedRevisions&lt;/code&gt;.&lt;/p&gt;
&lt;h2 id="remediation"&gt;Remediation&lt;/h2&gt;
&lt;p&gt;Nothing to remediate.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The controller keeps reconciling at &lt;code&gt;spec.interval&lt;/code&gt;; a re-render upstream (a new &lt;code&gt;ExternalArtifact&lt;/code&gt; revision) re-applies automatically and the condition stays &lt;code&gt;Succeeded&lt;/code&gt; once the new revision converges.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;status.stages[]&lt;/code&gt; reports per-stage &lt;code&gt;appliedRevision&lt;/code&gt; and inventory entry counts to confirm what each stage owns.&lt;/li&gt;
&lt;/ul&gt;</description></item><item><title>Suspended</title><link>https://stageset.projects.metio.wtf/runbooks/suspended/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://stageset.projects.metio.wtf/runbooks/suspended/</guid><description>&lt;h2 id="symptom"&gt;Symptom&lt;/h2&gt;
&lt;p&gt;&lt;code&gt;READY=False&lt;/code&gt;, &lt;code&gt;REASON=Suspended&lt;/code&gt;.&lt;/p&gt;
&lt;h2 id="cause"&gt;Cause&lt;/h2&gt;
&lt;p&gt;&lt;code&gt;spec.suspend: true&lt;/code&gt; is set, so the controller short-circuits before any resolution, build, or apply. This is an intentional operator action, not a failure — applied objects are left exactly as they were at the last successful run.&lt;/p&gt;
&lt;h2 id="remediation"&gt;Remediation&lt;/h2&gt;
&lt;p&gt;Resume by clearing the flag:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-shell" data-lang="shell"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;kubectl patch stageset &amp;lt;name&amp;gt; -n &amp;lt;namespace&amp;gt; --type&lt;span class="o"&gt;=&lt;/span&gt;merge -p &lt;span class="s1"&gt;&amp;#39;{&amp;#34;spec&amp;#34;:{&amp;#34;suspend&amp;#34;:false}}&amp;#39;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The next reconcile picks up from the current artifact revisions.&lt;/p&gt;</description></item><item><title>Update windows</title><link>https://stageset.projects.metio.wtf/usage/update-windows/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://stageset.projects.metio.wtf/usage/update-windows/</guid><description>&lt;p&gt;Update windows gate &lt;em&gt;when&lt;/em&gt; new artifact revisions roll out, without pausing
reconciliation. Drift correction keeps running; only the rollout of a &lt;em&gt;new&lt;/em&gt;
revision is held until a window allows it.&lt;/p&gt;
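&lt;p&gt;The distinction between drift correction and a new rollout can be sketched as a small decision function. This is a simplified illustration under stated assumptions, not the controller&amp;rsquo;s logic: it models a single deny-style window with a known start time and duration, and the names &lt;code&gt;window_active&lt;/code&gt; and &lt;code&gt;decide&lt;/code&gt; are hypothetical. It ignores cron parsing, time zones other than UTC, and any other window types or scopes.&lt;/p&gt;

```python
from datetime import datetime, timedelta, timezone


def window_active(now, window_start, duration):
    """True while `now` lies in the half-open span starting at `window_start`."""
    return now >= window_start and window_start + duration > now


def decide(now, deny_start, deny_duration, is_new_revision):
    """Return "apply" or "hold" for one reconcile under a single deny window."""
    if not is_new_revision:
        # Drift correction of the already-deployed revision is not gated.
        return "apply"
    if window_active(now, deny_start, deny_duration):
        # A new revision arriving inside the window waits for it to close.
        return "hold"
    return "apply"


# Hypothetical window: opens at 09:00 UTC and lasts eight hours.
start = datetime(2026, 6, 15, 9, 0, tzinfo=timezone.utc)
eight_hours = timedelta(hours=8)
noon = start + timedelta(hours=3)
evening = start + timedelta(hours=9)

print(decide(noon, start, eight_hours, is_new_revision=True))
print(decide(noon, start, eight_hours, is_new_revision=False))
print(decide(evening, start, eight_hours, is_new_revision=True))
```

&lt;p&gt;The three calls print &lt;code&gt;hold&lt;/code&gt;, &lt;code&gt;apply&lt;/code&gt;, and &lt;code&gt;apply&lt;/code&gt;: a new revision is held at noon, drift correction still runs at noon, and the new revision ships once the window has closed.&lt;/p&gt;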
&lt;h2 id="deny-a-recurring-window"&gt;Deny a recurring window&lt;/h2&gt;
&lt;p&gt;Freeze rollouts during business hours:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-yaml" data-lang="yaml"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;spec&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;stages&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;- &lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;app&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;sourceRef&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;my-app&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;updateWindows&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;- &lt;span class="nt"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;Deny&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;schedule&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;0 9 * * MON-FRI&amp;#34;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c"&gt;# 5-field cron: start of the window&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;duration&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;8h&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;timeZone&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;Europe/Berlin&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;A new revision that arrives inside the window is held; &lt;code&gt;status.pendingUpdate&lt;/code&gt;
records what is waiting, and &lt;code&gt;nextWindowOpens&lt;/code&gt; records when it will ship. The controller
emits an &lt;code&gt;UpdateDeferred&lt;/code&gt; event and increments &lt;code&gt;stageset_update_deferred_total&lt;/code&gt;.&lt;/p&gt;</description></item><item><title>UpdateDeferred</title><link>https://stageset.projects.metio.wtf/runbooks/updatedeferred/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://stageset.projects.metio.wtf/runbooks/updatedeferred/</guid><description>&lt;h2 id="symptom"&gt;Symptom&lt;/h2&gt;
&lt;p&gt;&lt;code&gt;READY=False&lt;/code&gt;, &lt;code&gt;REASON=UpdateDeferred&lt;/code&gt; (initial deploy held), or &lt;code&gt;READY=True&lt;/code&gt; with a message noting a deferral and a populated &lt;code&gt;status.pendingUpdate&lt;/code&gt; (an already-deployed StageSet with a held update).&lt;/p&gt;
&lt;h2 id="cause"&gt;Cause&lt;/h2&gt;
&lt;p&gt;This is &lt;strong&gt;not a failure&lt;/strong&gt; — it is time-based delivery working as configured. A new revision (or the first deploy) is being held because the StageSet&amp;rsquo;s &lt;a href="https://stageset.projects.metio.wtf/usage/update-windows/"&gt;&lt;code&gt;spec.updateWindows&lt;/code&gt;&lt;/a&gt; do not currently permit a rollout: either a &lt;code&gt;Deny&lt;/code&gt; window is active, or &lt;code&gt;Allow&lt;/code&gt; windows are declared and none is active right now. With &lt;code&gt;spec.windowScope: All&lt;/code&gt;, even drift correction is paused while a window is closed.&lt;/p&gt;</description></item><item><title>Versioned migrations</title><link>https://stageset.projects.metio.wtf/usage/versioned-migrations/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://stageset.projects.metio.wtf/usage/versioned-migrations/</guid><description>&lt;p&gt;Some changes only need to happen once, when you cross a release boundary — a
one-time data backfill on the way to 2.0, a schema conversion between 1.x and 2.x.
Versioned migrations run a ladder of &lt;a href="https://stageset.projects.metio.wtf/usage/actions/"&gt;actions&lt;/a&gt; exactly when the
deployed version steps over the boundary, and never again.&lt;/p&gt;
&lt;p&gt;Versioning is off until you set &lt;code&gt;spec.version&lt;/code&gt;.&lt;/p&gt;
&lt;h2 id="declaring-the-version"&gt;Declaring the version&lt;/h2&gt;
&lt;p&gt;The controller needs to know &lt;em&gt;what version is currently being deployed&lt;/em&gt;. There are
three ways to declare it; pick by &lt;strong&gt;where the version lives&lt;/strong&gt;.&lt;/p&gt;</description></item><item><title>Webhook cert renewal failing</title><link>https://stageset.projects.metio.wtf/runbooks/webhook-cert-renewal/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://stageset.projects.metio.wtf/runbooks/webhook-cert-renewal/</guid><description>&lt;h2 id="symptom"&gt;Symptom&lt;/h2&gt;
&lt;p&gt;&lt;code&gt;stageset_webhook_cert_renewal_failures_total&lt;/code&gt; is increasing; the
&lt;code&gt;StageSetWebhookCertRenewalFailing&lt;/code&gt; alert fires (see
&lt;a href="https://stageset.projects.metio.wtf/installation/operations/"&gt;operations&lt;/a&gt; for the alert set and its thresholds).
The current certificate keeps working until its natural expiry — that expiry is
the deadline, after which cluster-wide &lt;code&gt;StageSet&lt;/code&gt; admission breaks.&lt;/p&gt;
&lt;h2 id="cause"&gt;Cause&lt;/h2&gt;
&lt;p&gt;Only applies in &lt;code&gt;--webhook-cert-mode=self-signed&lt;/code&gt;. The in-pod renewer regenerates
the serving cert every &lt;code&gt;validity/3&lt;/code&gt; and patches the
&lt;code&gt;ValidatingWebhookConfiguration&lt;/code&gt;&amp;rsquo;s &lt;code&gt;caBundle&lt;/code&gt;. It fails when:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;the controller lost &lt;code&gt;update&lt;/code&gt; (or &lt;code&gt;get&lt;/code&gt;) on the named
&lt;code&gt;ValidatingWebhookConfiguration&lt;/code&gt; (&lt;code&gt;--webhook-validating-config-name&lt;/code&gt;),&lt;/li&gt;
&lt;li&gt;the VWC was renamed and the flag/&lt;code&gt;resourceNames&lt;/code&gt; weren&amp;rsquo;t updated,&lt;/li&gt;
&lt;li&gt;the cert directory (&lt;code&gt;--webhook-cert-dir&lt;/code&gt;) became read-only.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In &lt;code&gt;cert-manager&lt;/code&gt; mode this metric is irrelevant — &lt;a href="https://cert-manager.io/"&gt;cert-manager&lt;/a&gt; owns renewal.&lt;/p&gt;</description></item><item><title>Workqueue saturation</title><link>https://stageset.projects.metio.wtf/runbooks/workqueue-saturation/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://stageset.projects.metio.wtf/runbooks/workqueue-saturation/</guid><description>&lt;h2 id="symptom"&gt;Symptom&lt;/h2&gt;
&lt;p&gt;&lt;code&gt;workqueue_depth{controller=&amp;quot;stageset&amp;quot;}&lt;/code&gt; stays high; StageSets reconcile slowly or
lag behind their &lt;code&gt;spec.interval&lt;/code&gt;. The &lt;code&gt;StageSetControllerWorkqueueDepthHigh&lt;/code&gt; alert
fires (see &lt;a href="https://stageset.projects.metio.wtf/installation/operations/"&gt;operations&lt;/a&gt; for the alert set and its
thresholds).&lt;/p&gt;
&lt;h2 id="cause"&gt;Cause&lt;/h2&gt;
&lt;p&gt;The controller is enqueuing reconcile requests faster than it completes them.
Common causes:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;apiserver slowness&lt;/strong&gt; — applies, dry-runs, and status writes all block on the
apiserver (or the impersonated tenant&amp;rsquo;s authorization).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;slow sources&lt;/strong&gt; — a stage waiting on a large artifact fetch or a source that is
slow to become Ready holds a worker.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;a stuck stage&lt;/strong&gt; — an action with a long timeout (a &lt;code&gt;wait&lt;/code&gt;/&lt;code&gt;http&lt;/code&gt;/&lt;code&gt;job&lt;/code&gt; that
never completes) pins a worker for the whole timeout.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;too few workers for the StageSet count&lt;/strong&gt; — many StageSets reconciling on short
intervals.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="diagnosis"&gt;Diagnosis&lt;/h2&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-shell" data-lang="shell"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# which StageSets are churning?&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;kubectl get stagesets -A --sort-by&lt;span class="o"&gt;=&lt;/span&gt;.status.observedGeneration
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# controller logs for slow operations / retries&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;kubectl -n stageset-system logs deploy/stageset-controller --tail&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="m"&gt;200&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Correlate with &lt;code&gt;controller_runtime_reconcile_time_seconds&lt;/code&gt; (see
&lt;a href="https://stageset.projects.metio.wtf/runbooks/reconcile-latency/"&gt;reconcile latency&lt;/a&gt;) and apiserver latency.&lt;/p&gt;</description></item></channel></rss>