Address review feedback: when a caller passes a negative value to
WithStatusComputeWorkers, coerce it to zero rather than propagating
it to the underlying cli-utils watcher, where the behavior is
undefined. Zero is the safe default and matches the SDK opt-in
contract.
Signed-off-by: maplemiao <maplemiao@tencent.com>
Replace the package-level DefaultStatusComputeWorkers variable with a
WithStatusComputeWorkers WaitOption threaded through waitOptions into
the statusWaiter. This removes global mutable state from pkg/kube and
lets callers opt in explicitly.
SDK consumers (e.g. helm-controller) inherit the zero value, which
preserves the upstream cli-utils synchronous behavior and avoids an
unexpected fan-out of status-compute goroutines when many releases
reconcile concurrently. The Helm CLI continues to enable 8 workers by
default via a shared pkg/cmd/flags.go helper, so install/upgrade/
rollback/uninstall/test users still get the fix for multi-minute
informer stalls out of the box.
Signed-off-by: maplemiao <maplemiao@tencent.com>
Set StatusComputeWorkers=8 on DefaultStatusWatcher for Wait,
WaitWithJobs, and WatchUntilReady. This opts in to the async status
computation added in fluxcd/cli-utils#20, preventing the informer
notification pipeline from being blocked by slow API calls when many
resources are updated simultaneously.
Without this, status computation for resources like Deployments (which
require additional LIST ReplicaSets/Pods calls) runs serially inside
the informer, causing growing delays of 1-3+ minutes when upgrading
many resources at once (e.g., ~20 Deployments via Helm).
Signed-off-by: maplemiao <maplemiao@tencent.com>
The previous fix (increasing timeout / reducing deletion delay) did not
work because the flakiness is not a timing problem at all.
Root cause: fluxcd/cli-utils HasSynced() returns true after the initial
list item is *popped* from DeltaFIFO, which is before AddFunc delivers
the ResourceUpdateEvent to the collector. This creates a race where the
SyncEvent can arrive at the statusObserver *before* the pod's Current
status is recorded. When that happens:
- statusObserver sees pod as Unknown
- Unknown is skipped for WaitForDelete (by design, to handle resources
that were already deleted before watching started)
- AggregateStatus([], NotFoundStatus) == NotFoundStatus → cancel()
- The watch context is cancelled before DeleteFunc can fire
- Final check: pod still Current → error
The test intent is to verify that waitForDeleteCtx (not the cancelled
generalCtx) is selected. A non-existent resource satisfies this:
- With waitForDeleteCtx=Background(): informer syncs with empty list
→ Unknown → cancel → success ✓
- With generalCtx (cancelled, wrong): context immediately done
→ ctx.Err() appended → error returned ✓
Remove the goroutine-based deletion and the pod creation to eliminate
the race while preserving the context-selection assertion.
Signed-off-by: Terry Howe <terrylhowe@gmail.com>
TestMethodContextOverridesGeneralContext/WaitForDelete used a 1s
timeout with a 500ms deletion delay, leaving only ~500ms for the
fake watcher to propagate the delete event. On loaded CI runners
this window is too tight and causes intermittent failures.
Increase the timeout to 5s and reduce the deletion delay to 100ms
so there is ample headroom. Apply the same deletion-delay reduction
to TestStatusWaitForDelete which shares the same pattern.
Signed-off-by: Terry Howe <terrylhowe@gmail.com>
The toTOML doc comment said "returns empty string on marshal error"
but the implementation actually returns err.Error(). Fix the comment
to match the real behavior. Also mention mustToToml as the strict
alternative.
Signed-off-by: Ilya Kiselev <kis-ilya-a@yandex.ru>
Add `mustToToml` that panics on marshal error, consistent with
`mustToYaml` and `mustToJson`. This makes it possible for chart authors
to get a hard failure when TOML serialization fails, rather than having
to inspect the output manually.
`toToml` behavior is unchanged in this commit.
Closes#31430
Signed-off-by: Ilya Kiselev <kis-ilya-a@yandex.ru>
Remove pre-Go modules import path comments from pkg/kube test files
(ready_test.go, resource_test.go, statuswait_test.go) for consistency
with the rest of the package.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Terry Howe <terrylhowe@gmail.com>
Import path comments (e.g. `// import "helm.sh/helm/v4/pkg/kube"`) are
a pre-Go modules convention no longer needed in module-aware builds.
Some files in pkg/kube had these comments while others did not, causing
inconsistency that triggered downstream Kythe indexing errors.
Remove the import comments from all affected files to make the package
declaration consistent across the directory.
Fixes#31846
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: abhay1999 <abhaychaurasiya19@gmail.com>